HUMAN BLOOD PROTEINS EXPRESSED IN MONOCOT SEEDS

Field of the Invention

The present invention relates to human blood proteins produced in the seeds of
monocot plants for use in making human and animal topical compositions and human
therapeutic compositions.

Background of the Invention

Many human blood proteins are in short or limited supply, either because large
quantities of the protein are required for a positive therapeutic effect or because
demand for these proteins is high among the world population of patients having the
particular condition. It is also advantageous to produce blood proteins, normally
extracted from blood products, from an alternative source such as crop plants.
Production of blood proteins in plants mitigates contamination of the blood protein
fraction with human viruses and other disease-causing agents found in human or
animal blood product fractions.

Blood proteins such as hemoglobin, alpha-1-antitrypsin ("AAT"), fibrinogen,
human serum albumin, thrombin, antibodies, blood coagulation factors (e.g. Factors
V-XIII), and others are known to have therapeutic potential for a number of human
conditions.

Hemoglobin is the major blood component molecule transporting oxygen to cells.
Mammalian hemoglobins are tetrameric proteins made up of two α-like polypeptide
subunits and two non-α (usually β, γ, or δ) subunits. These subunits differ in primary
amino acid sequence, but have similar secondary and tertiary structures. Each globin
subunit has associated with it, by noncovalent interaction, an Fe2+-porphyrin complex
known as a heme group, to which oxygen binds. The predominant hemoglobin in adult
erythrocytes is α2β2, known as hemoglobin A1 (HbA). Each hemoglobin tetramer has a
molecular weight of 64kD and each α-like and β-like chain has a molecular weight of
approximately 15.7kD (141 amino acids) and 16.5kD (146 amino acids) respectively.
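
As a quick consistency check on the figures above, the tetramer mass can be recomputed from the two subunit masses. The following Python sketch is illustrative only: the names are hypothetical, the masses are the approximate values quoted in the text, and the heme groups are ignored.

```python
# Approximate subunit masses (kD) as quoted in the text above.
ALPHA_KD = 15.7  # alpha-like chain, 141 amino acids
BETA_KD = 16.5   # beta-like chain, 146 amino acids


def tetramer_mass_kd(alpha_kd: float, beta_kd: float) -> float:
    """Mass of an alpha2-beta2 tetramer from its subunit masses (heme ignored)."""
    return 2 * alpha_kd + 2 * beta_kd


hb_mass = tetramer_mass_kd(ALPHA_KD, BETA_KD)
print(f"alpha2-beta2 tetramer: {hb_mass:.1f} kD")
```

The sum of the four chains agrees with the stated tetramer weight of about 64kD to within rounding.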

AAT belongs to the class of serpin inhibitors and is one of the major protease
inhibitors in human plasma. AAT is a single 394 amino acid polypeptide having an
approximate molecular mass of 52kD, and contains about 15% carbohydrate in the
native human form of the molecule. Concentrations of AAT in human plasma range
from 1000 to 3000 mg/L and in human milk range from 100 to 400 mg/L. Its primary
physiological role is the inhibition of neutrophil elastase, with an insufficiency leading to the
development of pulmonary emphysema. Excess elastase activity leads to
emphysema, hepatitis and a variety of skin disorders. While the binding affinity of AAT
is highest for human neutrophil elastase, it also has affinity for pancreatic proteases
such as chymotrypsin and trypsin. The current primary source of AAT for the treatment
of AAT deficiency is human blood plasma.

Fibrinogen is involved in the blood coagulation cascade and is converted to fibrin
by its interaction with the natural clotting agent thrombin. Fibrin is the major component
of blood clots. Mature human fibrinogen consists of two pairs of three independent
polypeptide chains (α, β and γ) that are linked together by 29 intra- and intermolecular
disulfide bonds, forming a native protein of 340kD, and is present in human plasma at an
approximate concentration of 2500 mg/L. Three-dimensional structural analysis of
independent fibrinogen domains has provided detailed structural features giving
important clues to human fibrinogen's multifunctional role. The fibrinogen polypeptides
are approximately 72kD (α), 52kD (β) and 48kD (γ) respectively, with the β polypeptide
chain determining native molecule assembly. The structure of fibrinogen features a
number of structural and functional domains containing multiple binding sites that
facilitate interactions with itself, other proteins and certain cell types, and allow fibrinogen to
participate in a number of important physiological processes including blood coagulation,
inflammation, angiogenesis, wound closure, atherogenesis and tumorigenesis. Fibrin
formation from a clotting standpoint is mediated by the interaction of native fibrinogen
with its natural clotting agents Factor XIII and thrombin in the presence of blood soluble
calcium.

Albumin is a transport protein molecule that carries out many functions in
mammalian serum biology, notably that of a carrier of hormones and other soluble
ligands from site to site, and other activities that contribute largely to general
mammalian biochemistry. Human serum albumin ("HSA") is also the major protein
component of blood, being present at serum concentrations of approximately
30,000-50,000 mg/L. HSA is a single polypeptide chain of 66.5kD that is initially
synthesized as a prepro-albumin molecule in the liver and released from the
endoplasmic reticulum after N-terminal and C-terminal Golgi processing. The resultant
mature protein is 585 amino acids in length. It has been shown that the natural
preprosequence of HSA can function in correct protein targeting/processing across a
plant plasma membrane in transgenic tobacco leaves (Sijmons et al., 1990).

Prothrombin, a plasma glycoprotein, is the zymogen of the serine protease
thrombin that catalyzes the conversion of fibrinogen to fibrin as well as several other
reactions that may be important for blood coagulation. Prothrombin is a single
polypeptide chain of approximately 72,000 molecular weight. The complete human
thrombin cDNA encodes 622 amino acid residues, including a leader sequence of
36 amino acid residues. Active thrombin has an apparent molecular weight of 36,000
and is made up of two disulfide-linked polypeptide chains resulting from prothrombin
cleavage. The proteolytic events leading to in vitro activation and conversion of human
prothrombin to active thrombin have been extremely well characterized.

Factors V-XIII are proteins (mostly proteases in their active states) that are
involved in the 'intrinsic pathway' of the classical cascade mechanism for blood
coagulation. The majority of these molecules exist as precursors that are processed in
an ordered sequence of transformations from inactive to catalytically active forms.
Factor V is proaccelerin (the accelerator globulin) while Factor VI is the activated form
of Factor V. Factor VII is proconvertin, the plasma thromboplastin component, while
Factors VIII (antihemophilic factor) and IX (Christmas antihemophilic factor) are both
associated with the hemophilia disease state. Factors X (Stuart-Prower factor), XI
(plasma thromboplastin antecedent) and XII (Hageman factor) are all involved with the
maturation/stabilization of thrombin. Factor XIII (fibrin stabilizing factor) is a plasma
transglutaminase directly acting on fibrin during the clotting process. All these Factors
are present at relatively low concentrations in plasma (0.001 to 50 mg/L). Other protein factors
also involved in the blood coagulation cascade include Fletcher Factor (prekallikrein),
Fitzgerald factor (kininogen) and von Willebrand Factor.

Immunoglobulins (antibodies) present in humans act to confer resistance to a
variety of pathogens to which a patient may have been exposed. Immunoglobulin
molecules account for 15-20% of the mass in human serum and consist predominantly
of IgG, IgM and IgA-type antibodies involved in fighting various infections that invade
the blood system and potentially the rest of the body. IgG type antibodies are the most
prevalent and exist at a serum concentration of 6-18 g/L. The blood system
also serves as a carrier directing these molecules to specific areas of the body to
combat resulting infections and potential oncogenic targets. Mature antibodies consist
of two polypeptides (light and heavy chains) that must be expressed in equimolar
amounts and come together to form functional entities. The light chain (~25kD) is a
protein of ~210-240 amino acids in length while the heavy chain (~50kD) is a protein of
~450-460 amino acids in length. Both light and heavy chains carry signal peptides for
processing and secretion into the blood stream. Expression of monoclonal antibodies in
plants is of particular interest, because it requires the expression of two genes,
synthesis of two proteins and correct assembly of the tetrameric protein to result in a
functional antibody.
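
To illustrate what equimolar expression of the two chains implies in mass terms, the following Python sketch converts protein mass to moles using the approximate chain masses given above (~25kD light, ~50kD heavy). The function name and the sample amounts are hypothetical and are not measurements from this disclosure.

```python
import math

LIGHT_KD = 25.0  # approximate light-chain mass from the text
HEAVY_KD = 50.0  # approximate heavy-chain mass from the text


def nmol(mass_ug: float, mw_kd: float) -> float:
    """Convert micrograms of a protein of mw_kd kilodaltons to nanomoles.

    1 kD corresponds to 1000 g/mol, so micrograms / kD gives nanomoles directly.
    """
    return mass_ug / mw_kd


# Hypothetical amounts: twice as much heavy chain by mass as light chain.
light_nmol = nmol(10.0, LIGHT_KD)
heavy_nmol = nmol(20.0, HEAVY_KD)
is_equimolar = math.isclose(light_nmol, heavy_nmol)

# A tetramer of two light and two heavy chains, from the same approximate masses.
tetramer_kd = 2 * LIGHT_KD + 2 * HEAVY_KD
print(is_equimolar, tetramer_kd)
```

Because the heavy chain is roughly twice the mass of the light chain, equimolar amounts correspond to about a 2:1 heavy-to-light ratio by mass.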

Initial studies of antibodies in plants focused on the IgG antibody class (Hiatt et
al., 1989; Hiatt and Ma, 1992), but later studies explored the expression in plants of
complex antibody molecules such as secretory IgA antibodies (4 genes) and more
complex antibody forms (Ma et al., 1995; Vine et al., 2001).

U.S. Patent Nos. 6,417,429, 5,959,177, 5,639,947 and 5,202,422, all related
patents, disclose the production of antibody molecules in transgenic tobacco plant
leaves.

U.S. Patent No. 6,303,341 discloses the production of immunoglobulins 
containing protection proteins in tobacco plant leaves, stems, flowers and roots. 

Published U.S. Patent Application U.S. 2002/0174453 discloses the production of
antibodies in the plastids of tobacco plants.

Published U.S. Patent Application U.S. 2002/0046418 discloses a controlled
environment agriculture bioreactor for the commercial production of heterologous
proteins in transgenic plants. The specification discloses that production of mammalian
blood proteins can be achieved. Example 7 discloses the production of human blood
factors in the leaves of potato, tobacco and alfalfa plants.

U.S. Patent No. 6,344,600 discloses the production of hemoglobin and
myoglobin in tobacco plant leaves. Example X discloses the extraction and partial
purification of recombinant hemoglobin from tobacco seeds. The expression was
obtained by transformation of the coexpression plasmid pBIOC59, which was
constructed to allow targeting in the chloroplasts, and contained for this purpose the
transit peptide of the precursor of the small subunit of ribulose 1,5-diphosphate
carboxylase of Pisum sativum L. Expression in seeds was reported to be at a maximum
level of 0.05% recombinant hemoglobin relative to the total soluble proteins extracted.

Example XI of the '600 patent discloses the construction of plasmids containing
one of the α or β chains of hemoglobin allowing constitutive expression or expression in
the albumin in maize seeds. According to this disclosure, the constitutive or albumin-
specific expression of the hemoglobin chains required the following regulatory
sequences: one of three promoters allowing a constitutive expression ((i) the rice actin
promoter followed by the rice actin intron, contained in the plasmid pAct1-F4; (ii) the
35S double constitutive promoter of cauliflower mosaic virus; or (iii) the promoter of the
maize γ-zein gene contained in the plasmid pγ63) and one of two terminators ((i) the
35S polyA terminator; or (ii) the NOS polyA terminator). No experiment or data is
provided regarding transformation or expression of these plasmids in maize or maize
seeds.

U.S. Patent No. 5,767,363 discloses the use of a seed-specific promoter derived
from ACP of Brassica napus, to affect and vary the expression of seed oils in rape and
tobacco plants. The specification generically discloses that the seed-specific promoter
can be used for the expression of pharmaceutical proteins, such as blood factors or
human serum albumin; however, no experimental data whatsoever is presented in this
regard.

Daniell et al. (2001) is a review article discussing recent developments in the field
of medical molecular farming, including the production of antibodies and proteins in
plants.

None of these patents or publications discloses the production of human blood
proteins in monocot seeds in high yield. It is desirable to provide for the production of
human blood proteins in high yield free from contaminating source agents in order to
provide the patient population with sufficient supply of these proteins for use in treating
humans with conditions treatable by administration of a particular blood protein.

Summary of the Invention

In one aspect, the invention includes a method of producing a recombinant
human blood protein in monocot plant seeds, comprising the steps of:

(a) transforming a monocot plant cell with a chimeric gene comprising

(i) a promoter from the gene of a maturation-specific monocot plant storage
protein,

(ii) a first DNA sequence, operably linked to said promoter, encoding a
monocot plant seed-specific signal sequence capable of targeting a polypeptide linked
thereto to a monocot plant seed endosperm cell, and

(iii) a second DNA sequence, linked in translation frame with the first
DNA sequence, encoding a human blood protein, wherein the first DNA sequence and
the second DNA sequence together encode a fusion protein comprising an N-terminal
signal sequence and the human blood protein;

(b) growing a monocot plant from the transformed monocot plant cell for a
time sufficient to produce seeds containing the human blood protein; and

(c) harvesting the seeds from the plant,

wherein the human blood protein constitutes at least 3.0% of the total soluble protein in
the harvested seeds.
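
The yield criterion in the method above is a simple ratio of the blood protein to total soluble protein (TSP). The following Python sketch shows one way such a check could be expressed; the function names and the sample quantities are hypothetical and are not data from this disclosure.

```python
def percent_of_tsp(target_mg: float, total_soluble_mg: float) -> float:
    """Return the target protein as a percentage of total soluble protein."""
    if total_soluble_mg <= 0:
        raise ValueError("total soluble protein must be positive")
    return 100.0 * target_mg / total_soluble_mg


def meets_yield_threshold(target_mg: float, total_soluble_mg: float,
                          threshold_pct: float = 3.0) -> bool:
    """True if the target protein is at least threshold_pct of TSP."""
    return percent_of_tsp(target_mg, total_soluble_mg) >= threshold_pct


# Hypothetical extract: 0.45 mg of blood protein in 10 mg of soluble protein.
high_yield = meets_yield_threshold(0.45, 10.0)

# Hypothetical extract at the 0.05% level reported for the prior-art seed expression.
low_yield = meets_yield_threshold(0.005, 10.0)
print(high_yield, low_yield)
```

The default threshold of 3.0% mirrors the figure recited in the method; any other value can be passed as `threshold_pct`.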

The invention also includes a purified human blood protein obtained by the
method. Preferably, the human blood protein comprises one or more plant glycosyl
groups.

The invention also provides a monocot plant seed product, preferably selected
from whole seed, flour, extract and malt, prepared from the harvested seeds obtained
by the method of the invention. Preferably, the human blood protein constitutes at least
3.0% of the total soluble protein in the seed product.

The invention further provides a composition comprising a purified human blood
protein, preferably comprising at least one plant glycosyl group, and at least one
pharmaceutically acceptable excipient or nutrient, wherein the human blood protein is
produced in a monocot plant containing a nucleic acid sequence encoding the human
blood protein and is purified from seed harvested from the monocot plant. The nutrient
is from a source other than the monocot plant. The formulation can be used for
parenteral, enteric, inhalation, intranasal or topical delivery.

These and other objects and features of the invention will become more fully
apparent when the following detailed description of the invention is read in conjunction
with the accompanying drawings and claims.

Brief Description of the Figures 

Figure 1 shows plasmids with constructs containing three codon-optimized genes
encoding the fibrinogen polypeptides α (pAPI 398), β (pAPI 417) and γ (pAPI 327) (SEQ
ID NO: 1-3), each under the control of the rice glutelin promoter Gt1. These plasmids,
including a plasmid (not shown) containing the hygromycin selectable marker, were
bombarded into embryogenic rice callus to create transgenic rice plants expressing
these three genes in mature rice seeds.

Figure 2 shows a Western blot analysis of transgenic rice lines expressing
individual subunits of human fibrinogen. Lane 1, positive control, purified native human
fibrinogen (obtained from the Red Cross) showing all three polypeptide chains; Lane 2,
extract from Taipei 309, a non-transgenic rice variety; Lane 3, molecular weight
standard; Lane 4, rice seed extract expressing fibrinogen α chain; Lane 5, rice seed
extract expressing fibrinogen β chain; Lane 6, rice seed extract expressing fibrinogen γ
chain. Total protein extraction from rice seeds was performed in 2% SDS, 1M urea, 1% βMe
and PBS pH 7.4. Fibrinogen polypeptides were detected using antibody recognizing all
three chains or individual chains only.

Figure 3 shows the simultaneous expression of the three fibrinogen polypeptide
chains (α, β and γ) in transgenic rice seeds, analyzed by Western blot.
Fibrinogen polypeptides and protein aggregates were detected using antibody
recognizing all three chains. Figure 3A indicates total protein extracted from rice seeds
under non-denaturing conditions (350 mM NaCl, PBS pH 7.4, 0.01% Tween-20/Triton
X-100/CHAPS) and run on a non-denaturing 10% acrylamide gel. Lane 1, 1 µg purified
human fibrinogen; Lanes 2 & 3, extracts from Taipei 309, a non-transgenic rice variety;
Lane 4, molecular weight markers; Lanes 5 & 7, extracts from two transgenic rice lines
where 1.0% βMe was included in the extraction buffer; Lanes 6 & 8, extracts from two
transgenic rice lines without βMe in the extraction buffer. Lanes 6 & 8 show large
protein aggregates that were extracted under non-denaturing conditions from the
transgenic lines that run at the approximate position of complexed native human
fibrinogen. Figure 3B indicates total protein extracted from rice seeds in 2% SDS, 1M
urea, 1% βMe and PBS pH 7.4, and run on SDS-PAGE. Lane 1, positive control, native
human fibrinogen (obtained from the Red Cross) showing all three polypeptide chains;
Lane 2, molecular weight standards; Lanes 3-5, three independent transgenic rice lines
expressing all three fibrinogen polypeptides.

Figure 4 shows the plasmid pAPI 250 expressing the codon-optimized gene for
alpha-1-antitrypsin (AAT) (SEQ ID NO: 5) under the control of the rice glutelin promoter
Gt1. This plasmid, along with a plasmid (not shown) containing the hygromycin
selectable marker gene, was bombarded into embryogenic rice callus to create
transgenic rice plants expressing AAT in mature rice seeds.

Figure 5 shows Coomassie brilliant blue staining of aqueous phase extraction of
transgenic rice grains expressing human recombinant AAT. Both untransformed (rice
var. Kitaake) and transgenic rice seeds (~10 pooled R1 seeds from five individual
transgenic plants) were ground with PBS pH 7.4 buffer. The resulting extract was spun
at 14,000 rpm at 4°C for 10 min. Supernatant was collected and ~20 µg of this soluble
protein extract was resuspended in sample loading buffer, and loaded onto a precast
SDS-PAGE gel. Lane 1, molecular weight protein markers; Lane 2, purified non-
recombinant human AAT; Lane 3, extract from control non-transformed Kitaake variety.
Between lanes 2 and 3, the results from the extracts of the five individual transgenic
plants are shown.

Figure 6 shows Western blot analysis of recombinant human AAT expressed in
transgenic rice grains. The R1 pooled seed soluble protein extracts (~10 µg total
protein) from seven independent transgenic rice transformants were prepared as
described in Figure 5 above, separated by SDS-PAGE gel and then blotted onto a
nitrocellulose filter. The identification of AAT expressed in rice seeds was carried out by
Western analysis using anti-AAT antibody. Lane 1, molecular weight protein markers;
Lanes 2 & 3, 1 µg and 2 µg, respectively, of purified non-recombinant human AAT;
Lanes 4 & 5, control, non-transgenic rice extract (var. Kitaake). The final seven lanes
show the results from the extracts of the seven individual transgenic plants. Extracts
from two of the seven transgenic lines did not express AAT. The shift in gel mobility
between the non-recombinant human and recombinant rice-expressed forms is due to
differences in the type of glycosylation between the human and recombinant
rice-expressed proteins.

Figure 7 shows activity of purified recombinant AAT (rAAT) obtained from rice
extracts against purified porcine pancreatic elastase (PPE) as determined by
Coomassie staining and Western blot analysis. The activity of rAAT is demonstrated by
a band shift assay involving the AAT protease substrate, elastase. AAT samples from
human and rice extracts were incubated with an equal number of moles of PPE at 37°C for
15 min. The negative control for the band shift assay was prepared with the AAT samples
incubated with equal volume of PPE added. Lane MW refers to molecular weight
markers. Figure 7A: Lane 1, purified non-recombinant AAT from human plasma; Lane
2, purified AAT from human plasma + PPE; Lane 3, soluble protein extract containing
AAT from transgenic rice seed; Lane 4, protein extract containing AAT from transgenic
rice seed + PPE; Lane 5, non-transformed rice seed extract; Lane 6, non-transformed
rice seed extract + PPE. Figure 7B shows a shifted band in Lanes 1, 2 and 3. The
shifted band, a complex between PPE and an AAT fragment, is confirmed to contain
AAT by Western blot analysis. The lanes in Figure 7B are analogous to those in Figure
7A.

Figure 8 depicts AAT derived from rice cell extracts purified initially through
Con-A and DEAE Sepharose respectively, then loaded onto an octyl Sepharose column.
Octyl Sepharose is the final purification step and separates active AAT from an
inactivated form of the protein. Lane 1, molecular weight markers; Lane 2, 2 µg purified
non-recombinant human AAT as a standard; Lane 3, pooled eluate from the DEAE
Sepharose column. The remaining lanes show the flow-through and the eluate from
the octyl Sepharose column. Approximately 50 µL from each column fraction was
loaded onto an SDS-PAGE gel and the proteins visualized by Coomassie staining.
Octyl Sepharose flow-through shows the inactive AAT protein while the eluate resolves
active AAT.

Figure 9A depicts an AAT association rate constant for activity of purified
recombinant AAT against PPE determined (as described by the procedure in Figure 7)
using non-recombinant human AAT as a control. Data were generated by Coomassie
protein staining and Western blot analysis, as described in Figure 7. Figure 9B depicts
the thermostability of plant-derived recombinant AAT versus native human AAT
determined by the PPE inhibition assay.

Figure 10 shows the plasmid pAPI 9 for expression of codon-optimized human
serum albumin (HSA) (SEQ ID NO: 4) under the control of the rice Amy1A
promoter/signal peptide. This plasmid is useful for the expression of HSA in germinated
rice seeds.

Figure 11 shows the expression of HSA in transgenic rice seeds. Pooled seeds
from transgenic rice line 3-11-2 were imbibed in water for 24 hours, then 2 µM gibberellic
acid (GA) was added. Seed samples were taken at 24, 48, 72, and 120 hours post
GA addition and soluble proteins were extracted and prepared for Western analysis. 15
µg of soluble protein were loaded onto each lane along with protein isolated from the
non-transformed negative control line TP309. The blot was probed with monoclonal
antibody prepared against HSA.


Detailed Description of the Invention

Unless otherwise indicated, all terms used herein have the meanings given
below or are generally consistent with the same meaning that the terms have to those
skilled in the art of the present invention. Practitioners are particularly directed to
Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (Second Edition), Cold
Spring Harbor Press, Plainview, N.Y., Ausubel FM et al. (1993) Current Protocols in
Molecular Biology, John Wiley & Sons, New York, N.Y., and Gelvin and Schilperoort,
eds. (1997) Plant Molecular Biology Manual, Kluwer Academic Publishers, The
Netherlands for definitions and terms of the art.

The polynucleotides of the invention may be in the form of RNA or in the form of
DNA, and include messenger RNA, synthetic RNA and DNA, cDNA, and genomic DNA.
The DNA may be double-stranded or single-stranded, and if single-stranded may be the
coding strand or the non-coding (anti-sense, complementary) strand.

The term "stably transformed" with reference to a plant cell means the plant cell
has a non-native (heterologous) nucleic acid sequence integrated into its genome which
is maintained through two or more generations.

By "host cell" is meant a cell containing a vector and supporting the replication
and/or transcription and/or expression of the heterologous nucleic acid sequence.
Preferably, according to the invention, the host cell is a monocot plant cell. Other host
cells may be used as secondary hosts, including bacterial, yeast, insect, amphibian or
mammalian cells, to move DNA to a desired plant host cell.

A "plant cell" refers to any cell derived from a plant, including undifferentiated
tissue (e.g., callus) as well as plant seeds, pollen, propagules, embryos, suspension
cultures, meristematic regions, leaves, roots, shoots, gametophytes, sporophytes and
microspores.

The term "mature plant" refers to a fully differentiated plant.

The term "seed product" includes, but is not limited to, seed fractions such as
de-hulled whole seed, flour (seed that has been de-hulled by milling and ground into a
powder), a seed extract, preferably a protein extract (where the protein fraction of the
flour has been separated from the carbohydrate fraction), malt (including malt extract or
malt syrup) and/or a purified protein fraction derived from the transgenic grain.

The term "biological activity" refers to any biological activity typically attributed to 
that protein by those of skill in the art. 

The term "blood protein" refers to one or more proteins, or biologically active
fragments thereof, found in normal human blood, including, without limitation,
hemoglobin, alpha-1-antitrypsin, fibrinogen, human serum albumin,
prothrombin/thrombin, antibodies, blood coagulation factors (Factor V, Factor VI, Factor
VII, Factor VIII, Factor IX, Factor X, Factor XI, Factor XII, Factor XIII, Fletcher Factor,
Fitzgerald Factor and von Willebrand Factor), and biologically active fragments thereof.

The term "non-nutritional" refers to a pharmaceutically acceptable excipient
which does not as its primary effect provide nutrition to the recipient. Preferably, it may
provide one of the following services to an enterically delivered formulation, including
acting as a carrier for a therapeutic protein, protecting the protein from acids in the
digestive tract, providing a time-release of the active ingredients being delivered, or
otherwise providing a useful quality to the formulation in order to administer to the patient
the blood proteins.

"Monocot seed components" refers to carbohydrate, protein, and lipid 
components extractable from monocot seeds, typically mature monocot seeds.

"Seed maturation" refers to the period starting with fertilization in which
metabolizable reserves, e.g., sugars, oligosaccharides, starch, phenolics, amino acids,
and proteins, are deposited, with and without vacuole targeting, to various tissues in the
seed (grain), e.g., endosperm, testa, aleurone layer, and scutellar epithelium, leading to
grain enlargement, grain filling, and ending with grain desiccation.

"Maturation-specific protein promoter" refers to a promoter exhibiting
substantially upregulated activity (greater than 25%) during seed maturation.

"Heterologous DNA" refers to DNA which has been introduced into plant cells
from another source, or which is from a plant source, including the same plant source,
but which is under the control of a promoter that does not normally regulate expression
of the heterologous DNA.

"Heterologous protein" is a protein encoded by a heterologous DNA.

A "signal sequence" is an N- or C-terminal polypeptide sequence which is
effective to localize the peptide or protein to which it is attached to a selected
intracellular or extracellular region. Preferably, according to the invention, the signal
sequence targets the attached peptide or protein to a location such as an endosperm
cell, more preferably an endosperm-cell organelle, such as an intracellular vacuole or
other protein storage body, chloroplast, mitochondria, or endoplasmic reticulum, or
extracellular space, following secretion from the host cell.

Expression vectors for use in the present invention are chimeric nucleic acid
constructs (or expression vectors or cassettes), designed for operation in plants, with
associated upstream and downstream sequences.

In general, expression vectors for use in practicing the invention include the
following operably linked components that constitute a chimeric gene: a promoter from
the gene of a maturation-specific monocot plant storage protein, a first DNA sequence,
operably linked to the promoter, encoding a monocot plant seed-specific signal
sequence (such as an N-terminal leader sequence or a C-terminal trailer sequence)
capable of targeting a polypeptide linked thereto to an endosperm cell, preferably an
endosperm-cell organelle, such as a protein storage body, and a second DNA
sequence, linked in translation frame with the first DNA sequence, encoding a human
blood protein. The signal sequence is preferably cleaved from the human blood protein
in the plant cell.
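
The three operably linked components described above can be pictured as a simple record with a frame check on the N-terminal fusion. The following Python sketch is a hypothetical illustration only: the class, its fields and the short DNA strings are toy placeholders and do not correspond to any SEQ ID NO or construct of this disclosure.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list[str]:
    """Split a DNA string into consecutive triplets (last one may be partial)."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


@dataclass(frozen=True)
class ChimericGene:
    promoter: str    # name of a maturation-specific storage-protein promoter
    signal_seq: str  # first DNA sequence: encodes the N-terminal signal sequence
    coding_seq: str  # second DNA sequence: encodes the blood protein

    def fusion_orf(self) -> str:
        """Signal sequence followed directly by the blood-protein coding sequence."""
        return (self.signal_seq + self.coding_seq).upper()

    def is_in_frame(self) -> bool:
        """The coding sequence stays in the signal's reading frame only if the
        signal starts with ATG, is a whole number of codons, and has no stop codon."""
        sig = self.signal_seq.upper()
        if not sig.startswith("ATG") or len(sig) % 3 != 0:
            return False
        return not any(c in STOP_CODONS for c in codons(sig))


# Toy placeholder sequences, not real signal or blood-protein sequences.
gene = ChimericGene(promoter="Gt1", signal_seq="ATGGCTTCC", coding_seq="GATGCTCACTAA")
shifted = ChimericGene(promoter="Gt1", signal_seq="ATGGCTTC", coding_seq="GATGCTCACTAA")
print(gene.is_in_frame(), shifted.is_in_frame())
```

Dropping a single nucleotide from the signal sequence, as in `shifted`, moves the downstream coding sequence out of frame, which is why the two sequences must be linked in translation frame.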

The chimeric gene, in turn, is typically placed in a suitable plant-transformation
vector having (i) companion sequences upstream and/or downstream of the chimeric
gene which are of plasmid or viral origin and provide necessary characteristics to the
vector to permit the vector to move DNA from bacteria to the desired plant host; (ii) a
selectable marker sequence; and (iii) a transcriptional termination region generally at the
opposite end of the vector from the transcription initiation regulatory region.

Numerous types of appropriate expression vectors, and suitable regulatory
sequences, are known in the art for a variety of plant host cells. The promoter region is
chosen to be regulated in a manner allowing for induction under seed-maturation
conditions. In one aspect of this embodiment of the invention, the expression construct
includes a promoter which exhibits specifically upregulated activity during seed
maturation. Promoters for use in the invention are typically derived from cereals such
as rice, barley, wheat, oat, rye, corn, millet, triticale or sorghum. Examples of such
promoters include the maturation-specific promoter region associated with one of the
following maturation-specific monocot plant storage proteins: rice glutelins, oryzins, and
prolamines, barley hordeins, wheat gliadins and glutelins, maize zeins and glutelins, oat
glutelins, sorghum kafirins, millet pennisetins, and rye secalins. Regulatory regions
from these genes are exemplified by SEQ ID NOS: 6-14. Other
promoters suitable for expression in maturing seeds include the barley endosperm-
specific B1-hordein promoter, GluB-2 promoter, Bx7 promoter, Gt3 promoter, GluB-1
promoter and Rp-6 promoter, particularly if these promoters are used in conjunction with
transcription factors.

Of particular interest is the expression of the nucleic acid encoding a human
blood protein from a promoter that is preferentially expressed in plant seed tissue.
Examples of such promoter sequences include those sequences derived from
sequences encoding plant storage protein genes or from genes involved in fatty acid
biosynthesis in oilseeds. Exemplary preferred promoters include a glutelin (Gt1)
promoter, as exemplified by SEQ ID NO: 6, which effects gene expression in the outer
layer of the endosperm, and a globulin (Glb) promoter, as exemplified by SEQ ID NO: 7,
which effects gene expression in the center of the endosperm. Promoter sequences for
regulating transcription of gene coding sequences operably linked thereto include
naturally-occurring promoters, or regions thereof capable of directing seed-specific
transcription, and hybrid promoters, which combine elements of more than one
promoter. Methods for constructing such hybrid promoters are well known in the art.

In some cases, the promoter is native to the same plant species as the plant cells
into which the chimeric nucleic acid construct is to be introduced. In other
embodiments, the promoter is heterologous to the plant host cell.

Alternatively, a seed-specific promoter from one type of monocot may be used to
regulate transcription of a nucleic acid coding sequence from a different monocot or a
non-cereal monocot.

In addition to encoding the protein of interest, the expression cassette or
heterologous nucleic acid construct includes DNA encoding a signal peptide that allows
processing and translocation of the protein, as appropriate. Exemplary signal
sequences are those sequences associated with the monocot maturation-specific
genes: glutelins, prolamines, hordeins, gliadins, glutenins, zeins, albumin, globulin, ADP
glucose pyrophosphorylase, starch synthase, branching enzyme, Em, and lea.
Exemplary sequences encoding a signal peptide for a protein storage body are
identified herein as SEQ ID NOS: 15-21.

In one preferred embodiment, the method is directed toward the localization of
proteins in an endosperm cell, preferably an endosperm-cell organelle, such as a
protein storage body, mitochondrion, endoplasmic reticulum, vacuole, chloroplast or
other plastidic compartment. For example, when proteins are targeted to plastids, such
as chloroplasts, in order for expression to take place the construct also employs
sequences to direct the gene product to the plastid. Such sequences are referred to
herein as chloroplast transit peptides (CTP) or plastid transit peptides (PTP). In this
manner, when the gene of interest is not directly inserted into the plastid, the expression
construct additionally contains a gene encoding a transit peptide to direct the gene of
interest to the plastid. The chloroplast transit peptides may be derived from the gene of
interest, or may be derived from a heterologous sequence having a CTP. Such transit
peptides are known in the art. See, for example, Smeekens et al., 1986; Wasmann et
al., 1986; Von Heijne et al., 1991; and U.S. Patents 4,940,835 and 5,728,925. Additional
transit peptides for the translocation of the protein to the endoplasmic reticulum (ER)
(Chrispeels, 1991; Vitale and Chrispeels, 1992), nuclear localization signals (Shieh et
al., 1993; Varagona et al., 1992) or vacuole (Raikhel and Chrispeels, 1992; Bednarek
and Raikhel, 1992; also see U.S. Patent No. 5,360,726) may also find use in the
constructs of the present invention.

Another exemplary class of signal sequences are sequences effective to promote
secretion of heterologous protein from aleurone cells during seed germination, including
the signal sequences associated with alpha-amylase, protease, carboxypeptidase,
endoprotease, ribonuclease, DNase/RNase, (1-3)-beta-glucanase, (1-3)(1-4)-beta-
glucanase, esterase, acid phosphatase, pentosamine, endoxylanase, beta-
xylopyranosidase, arabinofuranosidase, beta-glucosidase, (1-6)-beta-glucanase,
peroxidase, and lysophospholipase.

Since many storage proteins are under the control of a maturation-
specific promoter, and this promoter is operably linked to a signal sequence for
targeting to a protein body, the promoter and signal sequence can be isolated from a
single protein-storage gene, then operably linked to a blood protein in the chimeric gene
construction. One preferred and exemplary promoter-signal sequence is from the rice
Gt1 gene, having an exemplary sequence identified by SEQ ID NO: 6. Alternatively, the
promoter and leader sequence may be derived from different genes. One preferred and
exemplary promoter-signal sequence combination is the rice Glb promoter linked to the
rice Gt1 leader sequence, as exemplified by SEQ ID NO: 7.

Preferably, expression vectors or heterologous nucleic acid constructs designed
for operation in plants comprise companion sequences upstream and downstream to
the expression cassette. The companion sequences are of plasmid or viral origin and
provide necessary characteristics to the vector to permit the vector to move DNA from a
secondary host to the plant host, such as sequences containing an origin of replication
and a selectable marker. Typical secondary hosts include bacteria and yeast.

In one embodiment, the secondary host is E. coli, the origin of replication is a
ColE1-type, and the selectable marker is a gene encoding ampicillin resistance. Such
sequences are well known in the art and are commercially available as well (e.g.,
Clontech, Palo Alto, Calif.; Stratagene, La Jolla, CA).

The transcription termination region may be taken from a gene where it is normally
associated with the transcriptional initiation region or may be taken from a different gene.
Exemplary transcriptional termination regions include the NOS terminator from
Agrobacterium Ti plasmid and the rice alpha-amylase terminator.

Polyadenylation tails may also be added to the expression cassette to optimize
high levels of transcription and proper transcription termination.
Polyadenylation sequences include, but are not limited to, the Agrobacterium octopine
synthetase signal, or the nopaline synthase of the same species.

Suitable selectable markers for selection in plant cells include, but are not limited
to, antibiotic resistance genes, such as kanamycin (nptII), G418, bleomycin,
hygromycin, chloramphenicol, ampicillin, tetracycline, and the like. Additional selectable
markers include a bar gene which codes for bialaphos resistance; a mutant EPSP
synthase gene which encodes glyphosate resistance; a nitrilase gene which confers
resistance to bromoxynil; a mutant acetolactate synthase gene (ALS) which confers
imidazolinone or sulphonylurea resistance; and a methotrexate resistant DHFR gene.

The particular marker gene employed is one which allows for selection of
transformed cells as compared to cells lacking the DNA which has been introduced.
Preferably, the selectable marker gene is one which facilitates selection at the tissue
culture stage, e.g., a kanamycin, hygromycin or ampicillin resistance gene.

The vectors of the present invention may also be modified to include intermediate
plant transformation plasmids that contain a region of homology to an Agrobacterium
tumefaciens vector, a T-DNA border region from Agrobacterium tumefaciens, and
chimeric genes or expression cassettes (described above). Further, the vectors of the
invention may comprise a disarmed plant tumor inducing plasmid of Agrobacterium
tumefaciens.

In general, a selected nucleic acid sequence is inserted into an appropriate
restriction endonuclease site or sites in the vector. Standard methods for cutting,
ligating and transformation into a secondary host cell, known to those of skill in the art,
are used in constructing vectors for use in the present invention. (See generally,
Maniatis et al., Ausubel et al., and Gelvin et al., supra.)

Plant cells or tissues are transformed with expression constructs (heterologous
nucleic acid constructs, e.g., plasmid DNA into which the gene of interest has been
inserted) using a variety of standard techniques. Effective introduction of vectors in
order to facilitate enhanced plant gene expression is an important aspect of the
invention. It is preferred that the vector sequences be stably transformed, preferably
integrated into the host genome.

The method used for transformation of host plant cells is not critical to the
present invention. The skilled artisan will recognize that a wide variety of transformation
techniques exist in the art, and new techniques are continually becoming available. Any
technique that is suitable for the target host plant may be employed within the scope of
the present invention. For example, the constructs can be introduced in a variety of
forms including, but not limited to, as a strand of DNA, in a plasmid, or in an artificial
chromosome. The introduction of the constructs into the target plant cells can be
accomplished by a variety of techniques, including, but not limited to, calcium-
phosphate-DNA co-precipitation, electroporation, microinjection, Agrobacterium-
mediated transformation, liposome-mediated transformation, protoplast fusion or
microprojectile bombardment (Christou, 1992; Sanford et al., 1993). The skilled artisan
can refer to the literature for details and select suitable techniques for use in the
methods of the present invention.

When Agrobacterium is used for plant cell transformation, a vector is introduced
into the Agrobacterium host for homologous recombination with T-DNA or the Ti- or Ri-
plasmid present in the Agrobacterium host. The Ti- or Ri-plasmid containing the T-DNA
for recombination may be armed (capable of causing gall formation) or disarmed
(incapable of causing gall formation), the latter being permissible, so long as the vir
genes are present in the transformed Agrobacterium host. The armed plasmid can give
a mixture of normal plant cells and gall.

In some instances where Agrobacterium is used as the vehicle for transforming
host plant cells, the expression or transcription construct bordered by the T-DNA border
region(s) is inserted into a broad host range vector capable of replication in E. coli and
Agrobacterium, examples of which are described in the literature, for example pRK2 or
derivatives thereof. See, for example, Ditta et al., 1980 and EPA 0 120 515.
Alternatively, one may insert the sequences to be expressed in plant cells into a vector
containing separate replication sequences, one of which stabilizes the vector in E. coli,
and the other in Agrobacterium. See, for example, McBride and Summerfelt, 1990,
wherein the pRiHRI (Jouanin et al., 1985) origin of replication is utilized and provides
for added stability of the plant expression vectors in host Agrobacterium cells.

Included with the expression construct and the T-DNA is one or more selectable
marker coding sequences which allow for selection of transformed Agrobacterium and
transformed plant cells. A number of antibiotic resistance markers have been
developed for use with plant cells; these include genes inactivating antibiotics such as
kanamycin, the aminoglycoside G418, hygromycin, or the like. The particular marker
employed is not essential to this invention, with a particular marker preferred depending
on the particular host and the manner of construction.

For Agrobacterium-mediated transformation of plant cells, explants are incubated
with Agrobacterium for a time sufficient to result in infection, the bacteria are killed, and the
plant cells are cultured in an appropriate selection medium. Once callus forms, shoot
formation can be encouraged by employing the appropriate plant hormones in
accordance with known methods and the shoots transferred to rooting medium for
regeneration of plants. The plants may then be grown to seed and the seed used to
establish repetitive generations and for isolation of the recombinant protein produced by
the plants.

There are a number of possible ways to obtain plant cells containing more than
one expression construct. In one approach, plant cells are co-transformed with a first
and second construct by inclusion of both expression constructs in a single
transformation vector or by using separate vectors, one of which expresses desired
genes. The second construct can be introduced into a plant that has already been
transformed with the first expression construct, or alternatively, transformed plants, one
having the first construct and one having the second construct, can be crossed to bring
the constructs together in the same plant.

In a preferred embodiment, the plants used in the methods of the present
invention are derived from members of the taxonomic family known as the Gramineae.
This includes all members of the grass family of which the edible varieties are known as
cereals. The cereals include a wide variety of species such as wheat (Triticum sps.),
rice (Oryza sps.), barley (Hordeum sps.), oats (Avena sps.), rye (Secale sps.), corn
(maize) (Zea sps.) and millet (Pennisetum sps.). In practicing the present invention,
preferred grains are rice, wheat, maize, barley, rye and triticale, and most preferred is
rice.

In order to produce transgenic plants that express human blood protein in seeds,
monocot plant cells or tissues derived from them are transformed with an expression
vector comprising the coding sequence for a human blood protein. The transgenic plant
cells are cultured in medium containing the appropriate selection agent to identify and
select for plant cells which express the heterologous nucleic acid sequence. After plant
cells that express the heterologous nucleic acid sequence are selected, whole plants
are regenerated from the selected transgenic plant cells. Techniques for regenerating
whole plants from transformed plant cells are generally known in the art.
Transgenic plant lines, e.g., rice, wheat, corn or barley, can be developed and genetic
crosses carried out using conventional plant breeding techniques.

Transformed plant cells are screened for the ability to be cultured in selective
media having a threshold concentration of a selective agent. Plant cells that grow on or
in the selective media are typically transferred to a fresh supply of the same media and
cultured again. The explants are then cultured under regeneration conditions to
produce regenerated plant shoots. After shoots form, the shoots are transferred to a
selective rooting medium to provide a complete plantlet. The plantlet may then be
grown to provide seed, cuttings, or the like for propagating the transformed plants. The
method provides for efficient transformation of plant cells and regeneration of transgenic
plants, which can produce a recombinant human blood protein.

The expression of the recombinant human blood protein may be confirmed using
standard analytical techniques such as Western blot, ELISA, PCR, HPLC, NMR, or
mass spectroscopy, together with assays for a biological activity specific to the
particular protein being expressed.

A purified blood protein recombinantly produced in a plant cell, preferably
substantially free of contaminants of the host plant cell, and preferably comprising at
least one plant glycosyl group, is also provided by the invention. The plant glycosyl
groups, while identifying that the blood protein was produced in a plant, do not
significantly impair the biological activity of the blood protein in any of the applied
therapeutic contexts (preferably less than 25% loss of activity, more preferably less than
10% loss of activity, as compared to a corresponding non-recombinant human blood
protein). Typically, in accordance with some embodiments of the invention, the human
blood protein constitutes at least about 0.5%, at least about 1.0% or at least about 2.0%
of the total soluble protein in the seeds harvested from the transgenic plant. In a
preferred embodiment, however, protein expression is much higher than previously
reported, i.e., at least about 3.0%, which makes commercial production quite feasible.
Advantageously, protein expression is at least about 5.0%, at least about 10%, at least
about 15%, at least about 20%, at least about 30%, or even at least about 40% of total
soluble protein.
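
The expression tiers above reduce to simple arithmetic. The following is a minimal sketch, assuming the recombinant protein and the total soluble protein (TSP) of a seed extract have been measured in the same mass units. The function names and the sample figures in the usage note are illustrative assumptions and do not come from the specification.

```python
# Expression tiers named in the text, in percent of total soluble protein (TSP).
THRESHOLDS = (0.5, 1.0, 2.0, 3.0, 5.0, 10.0, 15.0, 20.0, 30.0, 40.0)


def percent_of_tsp(target_mg, tsp_mg):
    """Return the recombinant protein as a percentage of total soluble protein."""
    if tsp_mg <= 0:
        raise ValueError("total soluble protein must be positive")
    return 100.0 * target_mg / tsp_mg


def highest_threshold_met(percent):
    """Return the largest 'at least about X%' tier reached, or None if below 0.5%."""
    met = [t for t in THRESHOLDS if percent >= t]
    return max(met) if met else None
```

For a hypothetical extract with 1.2 mg of blood protein in 10 mg of TSP, `percent_of_tsp(1.2, 10.0)` gives about 12%, and `highest_threshold_met` then reports the 10% tier.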

The invention includes plant seed product prepared from the harvested seeds.
Preferably, the human blood protein constitutes at least about 3.0% of the total soluble
protein in the seed product, more preferably at least about 5.0%, and most preferably at
least about 10.0%. As shown in the figures, the expression of human blood proteins in
rice grains, represented by AAT, the three fibrinogen polypeptides and HSA, represents
at least about 10% of total soluble protein.

The present invention also provides compositions comprising human blood
proteins produced recombinantly in the seeds of monocot plants, and methods of
making such compositions. In practicing the invention, a human blood protein is
produced in the seeds or grain of transgenic plants that express the nucleic acid coding
sequence for the human blood protein. After expression, the blood protein may be
provided to a patient in substantially unpurified form (i.e., at least 20% of the
composition comprises plant material), or the blood protein may be isolated or purified
from a product of the mature seed (e.g., flour, extract, malt or whole seed, etc.) and
formulated for delivery to a patient.

Such compositions can comprise a formulation for the type of delivery intended.
Delivery types can include, e.g., parenteral, enteric, inhalation, intranasal or topical
delivery. Parenteral delivery can include, e.g., intravenous, intramuscular, or
suppository. Enteric delivery can include, e.g., oral administration of a pill, capsule, or
other formulation made with a non-nutritional pharmaceutically-acceptable excipient, or
a composition with a nutrient from the transgenic plant, for example, in the grain extract
in which the protein is made, or from a source other than the transgenic plant. Such
nutrients include, for example, salts, saccharides, vitamins, minerals, amino acids,
peptides, and proteins other than the human blood protein. Intranasal and inhalant
delivery systems can include spray or aerosol in the nostrils or mouth. Topical delivery
can include, e.g., creams, topical sprays, or salves. Preferably, the composition is
substantially free of contaminants of the transgenic plant, preferably containing less
than 20% plant material, more preferably less than 10%, and most preferably, less than
5%. The preferable route of administration is enteric, and preferably the composition is
non-nutritional.

The blood protein can be purified from the seed product by a mode including 
grinding, filtration, heat, pressure, salt extraction, evaporation, or chromatography. 

The human blood proteins produced in accordance with the invention also
include all variants thereof, whether allelic variants or synthetic variants. A "variant"
human blood protein-encoding nucleic acid sequence may encode a variant human
blood protein amino acid sequence that is altered by one or more amino acids from the
native blood protein sequence, preferably at least one amino acid substitution, deletion
or insertion. The nucleic acid substitution, insertion or deletion leading to the variant
may occur at any residue within the sequence, as long as the encoded amino acid
sequence maintains substantially the same biological activity of the native human blood
protein. In another embodiment, the variant human blood protein nucleic acid sequence
may encode the same polypeptide as the native sequence but, due to the degeneracy
of the genetic code, the variant has a nucleic acid sequence altered by one or more
bases from the native polynucleotide sequence.

The variant nucleic acid sequence may encode a variant amino acid sequence
that contains a "conservative" substitution, wherein the substituted amino acid has
structural or chemical properties similar to the amino acid which it replaces and
physicochemical amino acid side chain properties and high substitution frequencies in
homologous proteins found in nature (as determined, e.g., by a standard Dayhoff
frequency exchange matrix or BLOSUM matrix). In addition, or alternatively, the variant
nucleic acid sequence may encode a variant amino acid sequence containing a "non-
conservative" substitution, wherein the substituted amino acid has dissimilar structural
or chemical properties to the amino acid it replaces. Standard substitution classes
include six classes of amino acids based on common side chain properties and highest
frequency of substitution in homologous proteins in nature, as is generally known to
those of skill in the art and may be employed to develop variant human blood protein-
encoding nucleic acid sequences.

As will be understood by those of skill in the art, in some cases it may be
advantageous to use a human blood protein-encoding nucleotide sequence
possessing non-naturally occurring codons. Codons preferred by a particular eukaryotic
host can be selected, for example, to increase the rate of human blood protein
expression or to produce recombinant RNA transcripts having desirable properties,
such as a longer half-life, than transcripts produced from naturally occurring sequence.
As an example, it has been shown that codons for genes expressed in rice are rich in
guanine (G) or cytosine (C) in the third codon position (Huang et al., 1990). Changing
low G + C content to a high G + C content has been found to increase the expression
levels of foreign protein genes in barley grains (Horvath et al., 2000). The blood protein
encoding genes employed in the present invention were synthesized by Operon
Technologies (Alameda, CA) based on the rice gene codon bias (Huang et al., 1990)
along with the appropriate restriction sites for gene cloning. These 'codon-optimized'
genes were linked to regulatory/secretion sequences for seed-directed monocot
expression and these chimeric genes then inserted into the appropriate plant
transformation vectors.
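
The idea of raising G + C content at the third codon position without changing the encoded protein can be sketched in a few lines. This is an illustrative toy only: it uses the standard genetic code and a simple rule (pick the alphabetically first synonymous codon ending in G or C), which is an assumption made here for demonstration. It is not the rice codon usage table of Huang et al., 1990, nor the procedure used to synthesize the genes of the invention, and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split a coding sequence into codons; the length must be a multiple of 3."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def gc3_fraction(seq):
    """Fraction of codons whose third base is G or C."""
    cs = codons(seq)
    return sum(c[2] in "GC" for c in cs) / len(cs)


def raise_gc3(seq):
    """Toy rule (an assumption): swap each codon that ends in A or T for the
    alphabetically first synonymous codon ending in G or C, when one exists."""
    out = []
    for c in codons(seq):
        aa = CODON_TABLE[c]
        synonyms = sorted(k for k, v in CODON_TABLE.items()
                          if v == aa and k[2] in "GC")
        out.append(c if c[2] in "GC" or not synonyms else synonyms[0])
    return "".join(out)
```

Because only synonymous codons are swapped, `translate(raise_gc3(seq))` equals `translate(seq)` for any valid coding sequence, which is the degeneracy property the text relies on. A real design would weight codons by measured host usage and avoid introducing unwanted restriction sites.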

A human blood protein-encoding nucleotide sequence may be engineered in
order to alter the human blood protein coding sequence for a variety of reasons,
including but not limited to, alterations which modify the cloning, processing and/or
expression of the human blood protein by a cell.

Heterologous nucleic acid constructs may include the coding sequence for a
given human blood protein (i) in isolation; (ii) in combination with additional coding
sequences, such as a fusion protein or signal peptide, in which the human blood protein
coding sequence is the dominant coding sequence; (iii) in combination with non-coding
sequences, such as introns and control elements, such as promoter and terminator
elements or 5' and/or 3' untranslated regions, effective for expression of the coding
sequence in a suitable host; and/or (iv) in a vector or host environment in which the
human blood protein coding sequence is a heterologous gene.

Depending upon the intended use, an expression construct may contain the
nucleic acid sequence encoding the entire human blood protein, or a portion thereof.
For example, where human blood protein sequences are used in constructs for use as a
probe, it may be advantageous to prepare constructs containing only a particular portion
of the human blood protein encoding sequence, for example a sequence which is
discovered to encode a highly conserved human blood protein region.

The invention provides, in one embodiment, a seed composition containing a
flour, extract, or malt obtained from mature monocot seeds and one or more seed-
produced human blood proteins in unpurified form. Isolating the blood proteins from the
flour can entail forming an extract composition by milling seeds to form a flour,
extracting the flour with an aqueous buffered solution, and optionally, further treating the
extract to partially concentrate the extract and/or remove unwanted components. In a
preferred method, mature monocot seeds, such as rice seeds, are milled to a flour, and
the flour then suspended in saline or in a buffer, such as Phosphate Buffered Saline
("PBS"), ammonium bicarbonate buffer, ammonium acetate buffer or Tris buffer. A
volatile buffer or salt, such as ammonium bicarbonate or ammonium acetate, may
obviate the need for a salt-removing step, and thus simplify the extract processing
method.

The flour suspension is incubated with shaking for a period typically between 30
minutes and 4 hours, at a temperature between 20-55°C. The resulting homogenate is
clarified either by filtration or centrifugation. The clarified filtrate or supernatant may be
further processed, for example by ultrafiltration or dialysis or both to remove
contaminants such as lipids, sugars and salt. Finally, the material may be dried, e.g., by
lyophilization, to form a dry cake or powder. The extract combines advantages of high
blood-protein yields, essentially limiting losses associated with protein purification.

In general, the protein once produced in a product of a mature seed can be
further purified by standard methods known in the art, such as by filtration, affinity
column, gel electrophoresis, and other such standard procedures. The purified protein
can then be formulated as desired for delivery to a human patient. More than one
protein can be combined for the therapeutic formulation. The protein may be purified
and used in biomedical applications requiring a non-food administration of the protein.

The following examples illustrate but are not intended in any way to limit the 
invention. 


EXAMPLE 1 

Production of transgenic rice encoding AAT and fibrinogen polypeptides 

The basic procedures of particle bombardment-mediated rice transformation and
plant regeneration were carried out as described by Huang et al., 2001. Rice variety
TP309 seeds were dehusked, sterilized in 50% (v/v) commercial bleach for 25 min and
washed with sterile water. The sterilized seeds were placed on rice callus induction
medium (RCI) plates containing [N6 salts (Sigma), B5 vitamins (Sigma), 2 mg/l 2,4-D
and 3% sucrose]. The rice seeds were incubated for 10 days to induce callus formation.
Primary callus was dissected from the seeds and placed on RCI for 3 weeks. This was
done twice more to generate secondary and tertiary callus which was used for
bombardment and continued subculture. A callus of 1-4 mm diameter was placed in a
4 cm circle on RCI with 0.3M mannitol and 0.3M sorbitol for 5-24 hrs prior to bombardment.
Microprojectile bombardment was carried out using the Biolistic PDS-1000/He system
(Bio-Rad). The procedure requires 1.5 mg gold particles (60 ug/ml) coated with 2.5 ug
DNA. DNA-coated gold particles were bombarded into rice calli with a He pressure of
1100 psi. After bombardment, the callus was allowed to recover for 48 hrs and then
transferred to RCI with 30 mg/l hygromycin B for selection and incubated in the dark for
45 days at 26°C. Transformed calli were selected and transferred to RCI (minus 2,4-D)
containing 5 mg/l ABA, 2 mg/l BAP, 1 mg/l NAA and 30 mg/l hygromycin B for 9-12
days. Transformed calli were transferred to regeneration medium consisting of RCI
(minus 2,4-D), 3 mg/l BAP, and 0.5 mg/l NAA without hygromycin B and cultured under
continuous lighting conditions for 2-4 weeks. Regenerated plantlets (1-3 cm high) were
transferred to rooting medium whose concentration was half that of MS medium (Sigma)
plus 1% sucrose and 0.05 mg/l NAA. After 2 weeks on rooting medium, the plantlets
developed roots and the shoots grew to about 10 cm. The plants were transferred to
6.5 x 6.5 cm pots containing a mix of 50% commercial soil (Sunshine #1) and 50% soil
from rice fields. The plants were covered by a plastic container to maintain nearly 100%
humidity and grown under continuous light for 1 week. The transparent plastic cover
was slowly shifted over a 1 day period to gradually reduce humidity, and water and
fertilizers were added as necessary. When the transgenic R0 plants were approximately 20
cm in height, they were transferred to a greenhouse where they grew to maturity.

Individual R1 seed grains from the individual R0 regenerated plants were
dissected into embryos and endosperms. Expression levels of recombinant blood
proteins (AAT and fibrinogen polypeptides) in the isolated rice endosperms were
determined. Embryos from the individual R1 grains with high recombinant protein
expression were sterilized in 50% bleach for 25 min and washed with sterile distilled
water. Sterilized embryos were placed in a tissue culture tube containing 1/2 MS basal
salts with the addition of 1% sucrose and 0.05 mg/l NAA. Embryos were germinated
and plantlets having ~7 cm shoots and healthy root systems were obtained in about 2
weeks. Mature R1 plants were obtained as regenerants.
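
The tissue-culture schedule in Example 1 can be tallied to estimate the time from seed plating to transfer of R0 plants to soil and acclimation. The sketch below only adds up the durations stated in the example. It assumes that each of the three callus passages lasts 3 weeks (the text states 3 weeks for the first and says it "was done twice more"), and it ignores any unstated handling time; the step labels and function name are illustrative.

```python
# (step, minimum days, maximum days), taken from the durations in Example 1.
STEPS = [
    ("callus induction on RCI", 10, 10),
    ("three callus passages on RCI (assumed 3 weeks each)", 63, 63),
    ("osmotic treatment before bombardment (5-24 hrs)", 5 / 24, 24 / 24),
    ("recovery after bombardment (48 hrs)", 2, 2),
    ("hygromycin B selection in the dark", 45, 45),
    ("pre-regeneration with ABA, BAP and NAA", 9, 12),
    ("regeneration under continuous light (2-4 weeks)", 14, 28),
    ("rooting medium", 14, 14),
    ("high-humidity growth in pots", 7, 7),
    ("gradual removal of the plastic cover", 1, 1),
]


def total_days(steps):
    """Return (minimum, maximum) total duration in days for the listed steps."""
    return (sum(s[1] for s in steps), sum(s[2] for s in steps))
```

Under these assumptions `total_days(STEPS)` comes to roughly 165 to 183 days before the plants continue growing to about 20 cm and move to the greenhouse.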

EXAMPLE 2 

Production of rice extract containing recombinant blood proteins and its use in
parenteral and enteric formulations 
General procedure for production of rice extract 

Transgenic rice containing heterologous polypeptides can be converted to rice
extracts by either a dry milling or wet milling process. In the dry milling process,
transgenic paddy rice seeds containing the heterologous polypeptides were dehusked
with a dehusker. The rice was ground into a fine flour through a dry milling process,
for example, in one experiment, at speed 3 of a model 91 Kitchen Mill from K-TEC.
Phosphate buffered saline ("PBS"), containing 0.135 N NaCl, 2.7 mM KCl, 10 mM
Na2HPO4, 1.7 mM KH2PO4, at pH 7.4, with or without additional NaCl, such as 0.35 N
NaCl, was added to the rice flour. In some experiments, approximately 10 ml of
extraction buffer was used for each 1 g of flour. In other experiments, the initial
flour/buffer ratio varied over a range such as 1 g/40 ml to 1 g/10 ml. The mixture was
incubated at room temperature with gentle shaking for 1 hr. In other experiments, the
incubation temperature was lower or higher, such as from about 22°C to about 60°C,
and the incubation time was longer or shorter, such as from about 10 minutes to about
24 hr. A Thermolyne VariMix platform mixer set at high speed was used to keep the
particulates suspended.

In place of PBS, other buffers, such as ammonium bicarbonate, were used in
some experiments. In one embodiment, 10 liters of 0.5M ammonium bicarbonate was
added to 1 kg of rice flour.
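
The buffer quantities above scale linearly with the amount of flour. The following sketch computes the extraction buffer volume for a given flour mass and the mass of each PBS salt needed for that volume. It assumes anhydrous salts with standard molar masses and treats 0.135 N NaCl as 0.135 M (normality equals molarity for a monovalent salt); the function names are illustrative and not part of the described procedure.

```python
# Molar masses in g/mol for the anhydrous salts (an assumption; hydrates differ).
MOLAR_MASS = {"NaCl": 58.44, "KCl": 74.55, "Na2HPO4": 141.96, "KH2PO4": 136.09}

# PBS composition from the text, in mol/l (0.135 N NaCl taken as 0.135 M).
PBS_MOLARITY = {"NaCl": 0.135, "KCl": 0.0027, "Na2HPO4": 0.010, "KH2PO4": 0.0017}


def buffer_volume_ml(flour_g, ml_per_g=10.0):
    """Extraction buffer volume for a flour mass; the text gives 10-40 ml per g."""
    if not 10.0 <= ml_per_g <= 40.0:
        raise ValueError("ratio outside the 1 g/40 ml to 1 g/10 ml range in the text")
    return flour_g * ml_per_g


def pbs_salt_grams(volume_ml):
    """Grams of each salt required to prepare the given volume of PBS."""
    liters = volume_ml / 1000.0
    return {salt: PBS_MOLARITY[salt] * MOLAR_MASS[salt] * liters
            for salt in PBS_MOLARITY}
```

For 1 kg of flour at 10 ml per g, `buffer_volume_ml(1000.0)` returns 10000 ml, matching the 10 liters per kilogram used in the ammonium bicarbonate embodiment.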

The resulting homogenate was clarified either by filtration or centrifugation. For
the filtration method, the mixture was allowed to settle for about 30 minutes at room
temperature, after which the homogenate was collected and filtered. Filters in three
different configurations were purchased from Pall Gelman Sciences and used. They
were: a 3 µm pleated capsule, a 1.2 µm serum capsule and a Suporcap capsule 50 (0.2
µm). For centrifugation, a Beckman J2-HC centrifuge was used and the mixture was
centrifuged at 30,000 g at 4°C for about 1 hr. The supernatant was retained and the
pellet discarded.
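
Because the centrifugation step is specified as a relative centrifugal force (30,000 g) rather than a rotor speed, the speed setting depends on the rotor. The sketch below uses the standard relation RCF = 1.118 x 10^-5 x r x N^2, with r the rotor radius in cm and N the speed in rpm. The rotor radius is not given in the text, so the 10 cm value in the usage note is purely hypothetical.

```python
import math

# Standard conversion constant for RCF = k * radius_cm * rpm**2.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a rotor speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf, radius_cm):
    """Rotor speed in rpm needed to reach a target RCF at a given radius."""
    if rcf <= 0 or radius_cm <= 0:
        raise ValueError("rcf and radius must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))
```

For a hypothetical rotor radius of 10 cm, `rpm_for_rcf(30000, 10.0)` is a little under 16,400 rpm; the actual setting should be taken from the rotor's own specifications.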

In one embodiment, the filtrate and supernatant were further processed, for 
example by ultra-filtration or dialysis or both to remove components such as lipids, 
sugars and salt. 

The filtrate from the above filtration procedure, which is also called the clarified
extract, was then concentrated using a spiral wound tangential flow filter operated in a
batch recirculation mode. In one embodiment, PES (polyethersulfone) 3000-4000
molecular weight cutoff membranes were used for this step. These final concentrated
extracts were held overnight in a cold room.

The concentrated extracts were next dried to a powder by lyophilization. The
lyophilized material was scraped from the lyophilizer trays and combined into a plastic
bag. The dry material was compressed by drawing a vacuum on the bag and then the
material was blended and the particle size reduced by hand-kneading it through the
plastic.

Rice extract can also be produced using a wet milling procedure. Transgenic
paddy rice seeds containing recombinant human blood protein can be re-hydrated for a
period of 0 to 288 hrs at 30°C. The rehydrated seeds are ground in PBS extraction
buffer. The initial seed/buffer ratio can vary over a range such as 1 g/40 ml to 1 g/10
ml.

Over 20% human blood protein can be recovered from the wet milling process.
The result of the wet milling becomes an initial extract that may be kept cold (4°C) or
stored frozen until use depending on the stability of the blood protein target. The
processing of initial extract to obtain dried extract is the same as that described for dry
milling in this section.

EXAMPLE 3 

Concentration and diafiltration of recombinant blood protein and control rice extracts.

The conditions used in concentration and diafiltration vary depending on volume,
speed, cost, etc. These conditions are standard in the art based on the description
herein. The frozen initial extract was thawed in the cold room (about 2-8°C) for six
hours. The thawed material was clarified through a 0.45 µm filter and concentrated
using a 5000 nominal molecular weight cutoff polyethersulfone membrane.

90 ml of the filtrate of control extract was concentrated to 10 ml and an additional 10
ml of deionized water can be added to the concentrated filtrate. The diluted filtrate can
be diafiltered one more time using water. The precipitate starts forming at 16 mS and
increases as the ionic strength decreases. A solution of 1.0 M ammonium bicarbonate
was added to the retentate to add ionic strength. The haze decreases although it does
not disappear completely. The material was diafiltered multiple times, in one
embodiment three times, with water and multiple times, in one embodiment three times,
with 0.1 M ammonium bicarbonate. It was concentrated to 9 ml and the membrane was
rinsed with 0.1 M ammonium bicarbonate. The concentrate was filtered through
several 0.2 µm button filters. In one embodiment, 2.3 ml of the filtrate is lyophilized as
is; 2.3 ml of the filtrate is diluted to 12 ml with deionized water and lyophilized; and 2.0
ml of the filtrate is diluted to 25 ml with deionized water and lyophilized. All the filtrates
remained clear.

A total of 89 ml of the filtrate of recombinant protein extract was concentrated to
10 ml, and an additional 10 ml of 0.1 M ammonium bicarbonate is added. The resulting
mixture is concentrated back to 10 ml and another 10 ml of 0.1 M ammonium
bicarbonate is added. The retentate starts to haze up. The material was diafiltered
multiple times, in one embodiment three times, with 0.1 M ammonium bicarbonate. It
was concentrated to 9 ml and the membrane was rinsed with 0.1 M ammonium
bicarbonate. The concentrate was filtered through several 0.45 µm button filters. In
one embodiment, 2.0 ml of the filtrate was lyophilized as is; 2.0 ml of the filtrate was
diluted to 12 ml with deionized water, where a haze formed, and lyophilized; and 2.0 ml
of the filtrate was diluted to 12 ml with 0.1 M ammonium bicarbonate, which remained
clear, and lyophilized.
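
As a rough aid to reading the volumes in this example, the sketch below computes the volumetric concentration factor and the fraction of a small solute remaining after repeated dilute-and-reconcentrate cycles. It assumes, which the description above does not state, that the solute (for example a salt) passes the membrane freely while the protein is fully retained. The function names are illustrative only.

```python
def concentration_factor(start_ml: float, final_ml: float) -> float:
    """Volumetric concentration factor, e.g. 90 ml reduced to 10 ml gives 9x."""
    if start_ml <= 0 or final_ml <= 0:
        raise ValueError("volumes must be positive")
    return start_ml / final_ml


def residual_fraction(retentate_ml: float, added_ml: float, cycles: int) -> float:
    """Fraction of a freely permeating solute left after discontinuous diafiltration.

    Each cycle dilutes the retentate with `added_ml` of fresh buffer and then
    concentrates back to `retentate_ml`. ASSUMPTION: the solute is not retained
    by the membrane at all (sieving coefficient of 1).
    """
    if retentate_ml <= 0 or added_ml < 0 or cycles < 0:
        raise ValueError("invalid volumes or cycle count")
    return (retentate_ml / (retentate_ml + added_ml)) ** cycles


if __name__ == "__main__":
    print(concentration_factor(90.0, 10.0))   # control extract, 90 ml to 10 ml
    print(residual_fraction(10.0, 10.0, 3))   # three 10 ml + 10 ml cycles
```

Under that assumption, three cycles of adding 10 ml to a 10 ml retentate would leave one eighth of the original small-solute content, which is consistent with the falling ionic strength noted for the water diafiltration.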

EXAMPLE 4 

Comparison of trial extraction of recombinant protein rice with PBS and ammonium
bicarbonate

The conditions used in concentration and diafiltration vary depending on volume,
speed, cost, etc. These conditions are all standard in the art based on the description
herein. Recombinant protein rice flour is mixed with extraction buffer at about 100 g/L
for about 1 hour using a magnetic stir bar. In one 2 L beaker, the extraction buffer is
PBS, pH 7.4 plus 0.35 M NaCl. In another 2 L beaker, the extraction buffer is 0.5 M
ammonium bicarbonate. A 15 cm Buchner funnel is pre-coated with about 6 g of Cel-pure
C300 before adding another 20 g of Cel-pure C300. The mixture is filtered at about
3-4 Hg. It is then washed twice with about 100 ml of the respective extraction buffer. The
extracted filtrate is collected and concentrated with ultra-filtration cartridges: 5K
Regenerated Cellulose, 5K PES, and 1K Regenerated Cellulose. The concentrates are
lyophilized and analyzed for recombinant blood protein activity content. Ammonium
bicarbonate and PBS, pH 7.4 plus 0.35 M NaCl, both extract approximately
the same amount of rAAT. There is little loss of recombinant protein units in the
permeate with any of the ultrafiltration units that were used.

Other extraction buffers, for example Tris buffer or ammonium acetate, can also be
used to extract recombinant proteins expressed in transgenic rice grains,
depending on the application.

EXAMPLE 5

Production of rice extracts containing recombinant blood proteins 

The conditions used in concentration and diafiltration vary depending on volume,
speed, cost, etc. These conditions are all standard in the art based on the description
herein. All equipment is soaked in hot 0.1 M NaOH at a starting temperature of about
55°C. Rice flour is added to an about 250-500 gal stainless steel tank containing 0.5 M
ammonium bicarbonate in a ratio of 95-105 g/L. It is mixed for about 60-80 minutes at
about 9°C.

Twelve plates of a 36 inch filter press G300 are pre-coated with about 3-6 kg Cel-pure
C300. About 19-26 g/L of Cel-pure is added to the extract and mixed thoroughly. The
mixture is pressed at a pressure of about 22 psi at a flow rate of about 82 liters/minute.
The filtrate is collected into a 250 gal stainless steel tank and washed with 0.5 M
ammonium bicarbonate. The press is blown dry. This process is carried out at about
10°C.

The 300 NMW cut-off membranes (polysulfone), which had been cleaned and
stored with 0.1 M NaOH after the control run, are rinsed thoroughly with deionized water. The
extract is concentrated and pumped to a 100 gal stainless steel tank. The membrane
and the concentration tank are flushed with 0.1 M ammonium bicarbonate to recover
all remaining extract. The products are covered with plastic and left in the 100 gal
tank overnight at room temperature. The concentrate is filtered through a spiral wound 1
µm filter into a 5 gal poly container.
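
For planning a run at this scale, the per-liter loadings given above (95-105 g/L of rice flour and about 19-26 g/L of Cel-pure) can be turned into mass ranges for a chosen buffer volume. The sketch below is illustrative only: the 1000 L buffer volume is a hypothetical figure, since the example gives tank capacities rather than the working volume, and the function names are not part of the described process.

```python
US_GALLON_L = 3.785411784  # liters per US liquid gallon (exact by definition)


def gallons_to_liters(gallons: float) -> float:
    """Convert US liquid gallons to liters."""
    return gallons * US_GALLON_L


def mass_range_kg(volume_l: float, low_g_per_l: float, high_g_per_l: float) -> tuple[float, float]:
    """Return the (low, high) mass in kg for a loading range given in g/L."""
    if volume_l <= 0 or low_g_per_l > high_g_per_l:
        raise ValueError("invalid volume or loading range")
    return (volume_l * low_g_per_l / 1000.0, volume_l * high_g_per_l / 1000.0)


if __name__ == "__main__":
    buffer_l = 1000.0  # HYPOTHETICAL working volume, not taken from the example
    print("flour (kg):", mass_range_kg(buffer_l, 95, 105))
    print("Cel-pure (kg):", mass_range_kg(buffer_l, 19, 26))
    print("250 gal tank capacity (L):", round(gallons_to_liters(250), 1))
```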

EXAMPLE 6

Blending of rice extract containing recombinant proteins into parenteral, inhalant,
intranasal and topical formulations.

Recombinant blood proteins (such as AAT) can be highly purified from
cereal grains for use in medical/pharmaceutical applications. A purification protocol for
rice seed extract expressed human AAT has been developed [Huang et al., 2002],
consisting of preparing a rice seed extract according to the above examples and further
purifying the extract preparation using Con-A, DEAE and Octyl Sepharose
chromatography respectively. AAT can be purified to greater than 90% homogeneity
utilizing such a procedure [Huang et al., 2002]. Purified AAT can be utilized in potential
pharma/medical applications for the following indications: AAT
augmentation/replacement therapy [Sandhaus, 1993; Hubbard et al., 1989], cystic
fibrosis [McElvaney et al., 1991; Allen, 1996], psoriasis, panniculitis and cutaneous
vasculitis [O'Riordan et al., 1997; Dowd et al., 1995] and pulmonary inflammation [Bingle
and Tetley, 1996]. For some of these indications, purified AAT protein preparations can
be administered via intravenous (iv) methods in 0.09% saline solution. Alternatively, the
saline solution could be buffered with serum albumin at 0.5% or some other
pharmacologically acceptable protein carrier molecule. AAT dosages are usually around
60 mg/kg. For aerosol delivery, an aerosol generating system can be employed utilizing
a compressed air driven nebulizer selected on the basis of its ability to
generate an aerosol with droplets of the optimum size (< 3 µm in aerodynamic diameter)
for deposition in the lower respiratory tract [Hubbard et al., 1989]. Again, proteins can
either be suspended in sterile water or a buffered saline solution containing a
pharmacologically acceptable protein carrier. Alternatively, a dried protein powder
containing the purified protein component could be utilized as the dispersal agent, and
this could be a rice-based extract in which the AAT component is not less
than 50% by weight.

In another case, recombinant rice expressed and extracted human blood proteins
such as AAT and fibrinogen can be employed topically. The use of fibrin
sealants/bandages has been widely accepted by the medical community. Fibrin
sealants are effective hemostatic agents [Mankad and Codispoti, 2001], a means for
achieving tissue adhesion, preventing fluid accumulation and promoting wound
healing [Spotnitz, 2001]. Fibrin sealants can also be used as a means of slowly
releasing medications, including antibiotics, growth factors and other agents [Spotnitz,
1997]. Rice expressed fibrinogen can also provide a potential low cost and animal virus
free source for these indications.

WHAT IS CLAIMED IS:

1. A method of producing a recombinant human blood protein in monocot 
plant seeds, comprising the steps of: 

(a) transforming a monocot plant cell with a chimeric gene comprising 

(i) a promoter from the gene of a maturation-specific monocot plant storage
protein,

(ii) a first DNA sequence, operably linked to said promoter, encoding a 
monocot plant seed-specific signal sequence capable of targeting a polypeptide linked 
thereto to a monocot plant seed endosperm cell, and 

(iii) a second DNA sequence, linked in translation frame with the first
DNA sequence, encoding a human blood protein, wherein the first DNA sequence and 
the second DNA sequence together encode a fusion protein comprising an N-terminal 
signal sequence and the human blood protein; 

(b) growing a monocot plant from the transformed monocot plant cell for a 
time sufficient to produce seeds containing the human blood protein; and 

(c) harvesting the seeds from the plant, 

wherein the human blood protein constitutes at least about 3.0% of the total soluble 
protein in the harvested seeds. 

2. The method of claim 1, wherein the human blood protein constitutes at
least about 5.0% of the total soluble protein in the harvested seeds. 

3. The method of claim 1, wherein the human blood protein constitutes at 
least about 10.0% of the total soluble protein in the harvested seeds. 

4. The method of claim 1, further comprising purifying the human blood
protein from the harvested seeds. 

5. The method of claim 1, wherein the human blood protein is selected from
the group consisting of hemoglobin, alpha-1-antitrypsin, fibrinogen, human serum
albumin, thrombin, an antibody, and a blood coagulation factor.

6. The method of claim 1, wherein the human blood protein produced in the
method comprises one or more plant glycosyl groups.

7. A purified human blood protein obtained by the method of claim 1, wherein
the human blood protein comprises one or more plant glycosyl groups.

8. The human blood protein of claim 7, selected from the group consisting of 
hemoglobin, alpha-1-antitrypsin, fibrinogen, human serum albumin, thrombin, an
antibody, and a blood coagulation factor. 

9. A monocot plant seed product selected from the group consisting of whole 
seed, flour, extract and malt, prepared from the harvested seeds obtained by the 
method of claim 1, wherein the human blood protein constitutes at least about 3.0% of 
the total soluble protein in the seed product. 

10. The seed product of claim 9, wherein the human blood protein constitutes 
at least about 5.0% of the total soluble protein in the seed product. 

11. The seed product of claim 9, wherein the human blood protein constitutes
at least about 10.0% of the total soluble protein in the seed product. 

12. The seed product of claim 9, wherein the human blood protein is selected 
from the group consisting of hemoglobin, alpha-1-antitrypsin, fibrinogen, human serum
albumin, thrombin, an antibody, and a blood coagulation factor. 

13. A composition comprising a purified human blood protein comprising at
least one plant glycosyl group and at least one pharmaceutically acceptable excipient or
nutrient, wherein the human blood protein is produced in a monocot plant containing a
nucleic acid sequence encoding the human blood protein and is purified from seed
harvested from the monocot plant, and wherein the at least one nutrient is from a
source other than the monocot plant.

14. The composition of claim 13, wherein the human blood protein is selected
from the group consisting of hemoglobin, alpha-1-antitrypsin, fibrinogen, human serum
albumin, thrombin, an antibody, and a blood coagulation factor.

15. The composition of claim 13, wherein the composition is substantially free 
of contaminants of the monocot plant. 

16. The composition of claim 13, wherein the at least one nutrient is selected 
from the group consisting of salts, saccharides, vitamins, minerals, amino acids, 
peptides, and proteins other than the human blood protein. 

17. The composition of claim 13, wherein the composition is formulated for 
parenteral, enteric, inhalation, intranasal or topical delivery. 

18. The composition of claim 13, wherein the composition is formulated for
enteric delivery, contains at least one pharmaceutically acceptable excipient, and is
non-nutritional.

Figure 1. Restriction maps of plasmids (pAPI 398), (pAPI 417) and (pAPI 327), containing
the codon-optimized human fibrinogen α, β and γ genes respectively, each under
the control of the rice glutelin promoter.

Figure 2. Western blot analysis of independent transgenic rice lines expressing individual
subunits of human fibrinogen.

Figure 3. Expression of the fibrinogen polypeptide subunits α, β and γ simultaneously in
transgenic rice seeds extracted under non-denaturing and denaturing conditions.

Figure 4. Construct showing the chimeric gene for the expression of alpha-1-antitrypsin in
transgenic monocot seeds.

Figure 5. Coomassie-stained gel of total soluble proteins obtained from transgenic rice
(var. Kitaake) seed extracts.

Figure 6. Western blot analysis of recombinant human AAT expressed in transgenic rice
grains.

Figure 7. Activity of recombinant AAT against purified porcine elastase (PPE) as
determined by Coomassie staining and Western blot analysis.

Figure 8. Purification of active AAT from inactive AAT via octyl-sepharose column
chromatography.

Figure 9. Initial biochemical characterization of AAT purified from rice seed extracts.

Figure 10. Construct map for the expression of human serum albumin in transgenic rice
utilizing the Ramy1A promoter.

Figure 11. Western blot showing the expression of human serum albumin produced in
transgenic rice seeds. Pooled seed from transgenic rice line 3-11-2 were imbibed in water
for 24 hours, then 2 µM gibberellic acid (GA) was added.

SEQUENCE LISTING

<110> HUANG, NING 

RODRIQUEZ, RAYMOND L.
HAGIE, FRANK E. 
STALKER, DAVID M. 

<120> HUMAN BLOOD PROTEINS EXPRESSED IN MONOCOT SEEDS 

<130> 023231-00003 

<140> 
<141> 

<150> 
<151> 

<160> 21 

<170> PatentIn Ver. 2.1

<210> 1 
<211> 1878 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Codon-optimized 
fibrinogen a-polypeptide coding sequence

<400> 1

gccgactccg gcgagggcga cttcctcgcc gagggcggcg gcgtccgggg gccgcgcgtc 60
gtcgagcggc accagtcggc ctgcaaggac tccgactggc cgttctgctc ggacgaggac 120 
tggaactaca agtgcccctc cggctgccgc atgaaggggc tgatcgacga ggtcaaccag 180 
gacttcacca accgcatcaa caagctcaag aactcgctgt tcgagtacca gaagaacaac 240 
aaggactccc actccctcac gaccaacatc atggagatcc tgcgcggcga cttctcctcc 300 
gcgaacaacc gcgacaacac ctacaaccgc gtctcggagg acctccgctc ccgcatcgag 360
gtcctgaagc ggaaggtgat cgagaaggtc cagcacatcc agctcctcca aaagaacgtc 420 
cgcgcccagc tcgtggacat gaagcgcctg gaggtggaca tcgacatcaa gatccggtcg 480
tgccgcggca gctgctcccg cgccctcgcc cgcgaggtgg acctcaagga ctacgaggac 540 
cagcagaagc agctggagca ggtcatcgcc aaggacctcc tcccgagccg cgaccggcag 600 
cacctcccac tgatcaagat gaagccggtg ccagacctgg tccccggcaa cttcaagagc 660 
cagctccaga aggtcccgcc ggagtggaag gccctcacgg acatgcccca aatgcgcatg 720
gagctggagc gccccggcgg caacgagatc acgcggggcg gctccacctc gtacggcacg 780 
ggctccgaga ccgagagccc ccgcaacccc tcctccgccg gctcgtggaa ctcggggtcc 840 
agcggccccg gttccacggg caaccgcaac cccggctcgt ccggcaccgg tggcaccgcc 900 
acgtggaagc caggtagctc ggggccgggc agcaccggca gctggaactc cggcagcagc 960
ggcaccggct cgacgggcaa ccagaacccg ggcagccccc gccccggctc cacggggacc 1020
tggaacccag gctcctccga gcgggggtcc gccggccact ggacgagcga gagctcggtg 1080 
tcgggctcga ccggccagtg gcactcggaa tccggcagct tccggccaga ctcccccggc 1140 
agcggcaacg cccggccgaa caacccagac tggggcacct tcgaggaggt ctccggtaac 1200
gtgagccccg gcacgcgccg ggagtaccac acggagaagc tggtgacgtc gaagggcgac 1260 
aaggagctcc ggaccggcaa ggagaaggtg acctccggct cgaccaccac cacccggcgg 1320 
tcctgctcga agaccgtgac gaagaccgtc atcggtccgg acggccacaa ggaggtcacc 1380 
aaggaggtgg tcaccagcga ggacggctcg gactgcccgg aggccatgga cctgggcacc 1440 
ctcagcggca tcggcacgct ggacggcttc cgccaccggc acccggacga ggccgccttc 1500 
ttcgacaccg ctagcaccgg caagaccttc cccggtttct tctcgccgat gctcggcgag 1560 
ttcgtgtccg agacggagag ccggggcagc gagagtggca tcttcaccaa caccaaggaa 1620
tcctcctcgc accacccagg tatcgcggag ttcccgagcc gggggaagtc ctcctcctac 1680
tccaagcagt tcaccagctc cacctcctac aaccggggcg acagcacgtt cgagagcaag 1740
agctacaaga tggccgacga ggccggttcc gaggccgacc acgagggcac ccactcgacg 1800
aagcgcgggc acgccaagtc gcgcccagtg cgcgggatcc acacgtcccc gcgtcgcaag 1860
cccagcctgt ccccgtga 1878
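
Because running counters and stray characters can creep into a sequence listing when it is transcribed, a quick length check against the declared <211> value is a useful sanity test. The following is a minimal sketch, assuming each listing line holds a/c/g/t groups followed by a running counter; the function names are hypothetical and are not part of PatentIn.

```python
import re


def clean_sequence(lines: list[str]) -> str:
    """Join listing lines into one nucleotide string.

    Everything that is not a, c, g or t (spaces, counters, stray punctuation)
    is dropped, so a misread base shows up later as a length mismatch.
    """
    return "".join(re.sub(r"[^acgt]", "", line.lower()) for line in lines)


def matches_declared_length(lines: list[str], declared_length: int) -> bool:
    """Return True when the cleaned sequence length equals the <211> value."""
    return len(clean_sequence(lines)) == declared_length


if __name__ == "__main__":
    # Two short illustrative lines, the second with a stray apostrophe.
    sample = ["gccgactccg gcgagggcga 20", "tg'gggcacct 30"]
    print(clean_sequence(sample))
    print(matches_declared_length(sample, 30))
```

A check of this kind only confirms the base count; it cannot detect a base that was misread as another valid base.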

<210> 2 
<211> 1386 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Codon-optimized 
fibrinogen B-polypeptide coding sequence 

<400> 2 

caaggggtga acgacaacga ggagggtttc ttctccgccc gcggccaccg cccgctggac 60
aagaagcgcg aggaggcccc gagcctgcgc cccgcgcccc cccccatctc cggcggcggc 120
taccgggccc ggccggcaaa ggcggcggca acgcagaaga aggtcgagcg gaaggccccg 180
gacgccggcg gctgcctgca cgcggacccg gacctcggcg tcctgtgccc aacggggtgc 240 
cagctccaag aggcgctcct ccaacaggag cgccccatcc ggaactccgt agacgagctg 300 
aacaacaacg tggaggcagt gagccagacc tcctcgtcca gcttccagta catgtacctc 360 
ctcaaggacc tctggcagaa gcggcaaaag caggtgaagg acaacgagaa cgtggtgaac 420 
gagtacagct ccgagctcga aaagcaccag ctgtacatcg acgagaccgt gaactcgaac 480 
atccccacga acctccgcgt cctgcgctcg atcctggaga acctccggag caagatccag 540
aagctagaat ccgacgtgtc ggcccagatg gagtattgcc ggaccccgtg caccgtcagc 600
tgcaacatcc cggtggtcag cggcaaggag tgcgaggaga tcatccgcaa gggcggcgag 660 
accagcgaga tgtacctcat ccaacccgat tcctccgtca agccataccg ggtgtactgc 720 
gacatgaaca cggagaacgg cgggtggacc gtgatccaga accgccagga cggctccgtg 780 
gacttcggcc gcaagtggga cccgtacaag cagggcttcg gcaacgtggc cacgaacacg 840 
gacgggaaga actactgcgg gctccccggc gaatactggc tgggcaacga caagatctcc 900 
cagctgaccc gcatgggccc caccgagctg ctcatcgaga tggaggactg gaagggcgac 960
aaggtgaagg cccactacgg gggcttcacg gtgcagaacg aggcgaacaa gtaccaaatc 1020 
tcggtgaaca agtaccgcgg caccgctggg aacgcgctca tggacggcgc gagccagctg 1080 
atgggcgaga accgcaccat gaccatccac aacggcatgt tcttcagcac ctacgaccgc 1140 
gacaacgacg ggtggctcac gagcgacccc cggaagcagt gctcgaagga ggacggcggc 1200 
ggctggtggt acaaccgctg ccacgcggca aaccccaacg gtcgctacta ctggggcggt 1260 
cagtacacgt gggacatggc gaagcacggc accgacgacg gcgtcgtctg gatgaactgg 1320
aagggctcgt ggtacagcat gcggaagatg tccatgaaga tccgcccctt cttcccccag 1380 
cagtga 1386

<210> 3 
<211> 1236 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Codon-optimized
fibrinogen y-polypeptide encoding sequence 

<400> 3 

tacgtcgcca cccgggacaa ctgctgcatc ctggacgagc ggttcgggag ctactgccca 60 

accacctgcg gcatcgccga cttcctgtcc acgtaccaga cgaaggtgga caaggacctc 120 

cagtccctgg aggacatcct ccaccaggtg gagaacaaga cgtcggaggt caagcagctc 180 

atcaaggcca tccagctcac ctacaacccg gacgaatcgt ccaagcccaa catgatcgac 240

gccgccaccc tcaagtcgcg gaagatgctg gaggagatca tgaagtacga ggcgtccatc 300 

ctcacccacg actcctccat ccgctacctc caggagatct acaactccaa caaccaaaag 360 

atcgtcaacc tcaaggagaa ggtcgcccag ctggaggcgc aatgccagga gccctgcaag 420 
gacacggtgc aaatccacga catcacgggg aaggactgcc aagacatcgc caacaagggc 480

gccaagcaga gcgggctcta cttcatcaag cccctcaagg cgaaccagca gttcctggtc 540 

tactgcgaga tcgacggctc gggcaacggc tggaccgtct tccagaagcg cctcgacggc 600 

tccgtggact tcaagaagaa ctggatccaa tacaaggagg gcttcggcca cctctccccc 660

accggcacga cggagttctg gctgggcaac gagaagatcc acctcatctc cacgcagagc 720 

gcgatcccat acgccctccg ggtggagctg gaggactgga acggccgcac cagcaccgcg 780 

gactacgcaa tgttcaaggt gggcccagag gcggacaagt accggctgac ctacgcctac 840 

ttcgcgggcg gggacgcggg ggacgccttc gacgggttcg acttcggtga cgacccaagc 900 

gacaagttct tcacgtccca caacggtatg cagttcagca cgtgggacaa cgacaacgac 960

aagttcgagg gtaactgcgc ggagcaggac ggcagcggct ggtggatgaa caagtgccac 1020

gcgggccacc tcaacggcgt ctactaccag ggcgggacct acagcaaggc atccacgcca 1080

aacgggtacg acaacggtat catctgggcc acgtggaaga cgcgctggta cagcatgaag 1140

aagaccacca tgaagatcat cccgttcaac cggctgacca tcggtgaggg ccagcagcac 1200 
cacctcggcg gggccaagca ggcgggcgac gtgtga 1236 



<210> 4 

<211> 1748 

<212> DNA 

<213> Homo sapiens 

<400> 4 

gatgcacaca agagtgaggt tgctcatcgg tttaaagatt tgggagaaga aaatttcaaa 60 
gccttggtgt tgattgcctt tgctcagtat cttcagcagt gtccatttga agatcatgta 120
aaattagtga atgaagtaac tgaatttgca aaaacatgtg ttgctgatga gtcagctgaa 180 
aattgtgaca aatcacttca tacccttttt ggagacaaat tatgcacagt tgcaactctt 240 
cgtgaaacct atggtgaaat ggctgactgc tgtgcaaaac aagaacctga gagaaatgaa 300 
tgcttcttgc aacacaaaga tgacaaccca aacctccccc gattggtgag accagaggtt 360 
gatgtgatgt gcactgcttt tcatgacaat gaagagacat ttttgaaaaa atacttatat 420 
gaaattgcca gaagacatcc ttacttttat gccccggaac tccttttctt tgctaaaagg 480 
tataaagctg cttttacaga atgttgccaa gctgctgata aagctgcctg cctgttgcca 540 
aagctcgatg aacttcggga tgaagggaag gcttcgtctg ccaaacagag actcaagtgt 600 
gccagtctcc aaaaatttgg agaaagagct ttcaaagcat gggcagtagc tcgcctgagc 660 
cagagatttc ccaaagctga gtttgcagaa gtttccaagt tagtgacaga tcttaccaaa 720
gtccacacgg aatgctgcca tggagatctg cttgaatgtg ctgatgacag ggcggacctt 780 
gccaagtata tctgtgaaaa tcaagattcg atctccagta aactgaagga atgctgtgaa 840
aaacctctgt tggaaaaatc ccactgcatt gccgaagtgg aaaatgatga gatgcctgct 900 
gacttgcctt cattagctgc tgattttgtt gaaagtaagg atgtttgcaa aaactatgct 960 
gaggcaaagg atgtcttcct gggcatgttt ttttatgaat atgcaagaag gcatcctgat 1020 
tactctgtcg tgctgctgct gagacttgcc aagacatatg aaaccactct agagaagtgc 1080 
tgtgccgctg cagatcctca tgaatgctat gccaaagtgt tcgatgaatt taaacctctt 1140 
gtggaagagc ctcagaattt aatcaaacaa aattgtgagc tttttgagca gcttggagag 1200 
tacaaattcc agaatgcgct attagttcgt tacaccaaga aagtacccca agtgtcaact 1260 
ccaactcttg tagaggtctc aagaaaccta ggaaaagtgg gcagcaaatg ttgtaaacat 1320 
cctgaagcaa aaagaatgcc ctgtgcagaa gactatctat ccgtggtcct gaaccagtta 1380
tgtgtgttgc atgagaaaac gccagtaagt gacagagtca ccaaatgctg cacagaatcc 1440
ttggtgaaca ggcgaccatg cttttcagct ctggaagtcg atgaaacata cgttcccaaa 1500 
gagtttaatg ctgaaacatt caccttccat gcagatatat gcacactttc tgagaaggag 1560 
agacaaatca agaaacaaac tgcacttgtt gagcttgtga caaggcaaca aaagagcaac 1620 
tgaaagctgt tatggatgat ttcgcagctt ttgtagagaa gtgctgcaag gctgacgata 1680 
aggagacctg ctttgccgag gagggtaaaa aacttgttgc tgcaagtcaa gctgccttag 1740
gcttataa 1748 



<210> 5 
<211> 1185 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Codon
optimized alpha-1-antitrypsin coding sequence

<400> 5 

gaggacccgc agggcgacgc cgcccagaag accgacacca gccaccacga ccaggaccac 60 
ccgacgttca acaagatcac cccgaatttg gccgaattcg ccttcagcct gtaccgccag 120 
ctcgcgcacc agtccaactc caccaacatc ttcttcagcc cggtgagcat cgccaccgcc 180 
ttcgccatgc tgtccctggg taccaaggcg gacacccacg acgagatcct cgaagggctg 240 
aacttcaacc tgacggagat cccggaggcg cagatccacg agggcttcca ggagctgctc 300 
aggacgctca accagccgga ctcccagctc cagctcacca ccggcaacgg gctcttcctg 360 
tccgagggcc tcaagctcgt cgataagttc ctggaggacg tgaagaagct ctaccactcc 420 
gaggcgttca ccgtcaactt cggggacacc gaggaggcca agaagcagat caacgactac 480 
gtcgagaagg ggacccaggg caagatcgtg gacctggtca aggaattgga cagggacacc 540
gtcttcgcgc tcgtcaacta catcttcttc aagggcaagt gggagcgccc gttcgaggtg 600 
aaggacaccg aggaggagga cttccacgtc gaccaggtca ccaccgtcaa ggtcccgatg 660 
atgaagaggc tcggcatgtt caacatccag cactgcaaga agctctccag ctgggtgctc 720 
ctcatgaagt acctggggaa cgccaccgcc atcttcttcc tgccggacga gggcaagctc 780 
cagcacctgg agaacgagct gacgcacgac atcatcacga agttcctgga gaacgaggac 840 
aggcgctccg ctagcctcca cctcccgaag ctgagcatca ccggcacgta cgacctgaag 900 
agcgtgctgg gccagctggg catcacgaag gtcttcagca acggcgcgga cctctccggc 960 
gtgacggagg aggcccccct gaagctctcc aaggccgtgc acaaggcggt gctcacgatc 1020 
gacgagaagg ggacggaagc tgccggggcc atgttcctgg aggccatccc cgtgtccatc 1080 
ccgcccgagg tcaagttcaa caagcccttc gtcttcctga tgatcgagca gaacacgaag 1140 
agccccctct tcatggggaa ggtcgtcaac cccacgcaga agtga 1185 



<210> 6 
<211> 786 
<212> DNA 
<213> Oryza sp.

<400> 6 

catgagtaat gtgtgagcat tatgggacca cgaaataaaa agaacatttt gatgagtcgt 60 
gtatcctcga tgagcctcaa aagttctctc accccggata agaaaccctt aagcaatgtg 120 
caaagtttgc attctccact gacataatgc aaaataagat atcatcgatg acatagcaac 180 
tcatgcatca tatcatgcct ctctcaacct attcattcct actcatctac ataagtatct 240 
tcagctaaat gttagaacat aaacccataa gtcacgtttg atgagtatta ggcgtgacac 300 
atgacaaatc acagactcaa gcaagataaa gcaaaatgat gtgtacataa aactccagag 360 
ctatatgtca tattgcaaaa agaggagagc ttataagaca aggcatgact cacaaaaatt 420 
cacttgcctt tcgtgtcaaa aagaggaggg ctttacatta tccatgtcat attgcaaaag 480 
aaagagagaa agaacaacac aatgctgcgt caattataca tatctgtatg tccatcatta 540 
ttcatccacc tttcgtgtac cacacttcat atatcataag agtcacttca cgtctggaca 600 
ttaacaaact ctatcttaac atttagatgc aagagccttt atctcactat aaatgcacga 660 
tgatttctca ttgtttctca caaaaagcgg ccgcttcatt agtcctacaa caacatggca 720 
tccataaatc gccccatagt tttcttcaca gtttgcttgt tcctcttgtg cgatggctcc 780 
ctagcc 786 



<210> 7 
<211> 1055 
<212> DNA 
<213> Oryza sp. 

<400> 7 

ctgcagggag gagaggggag agatggtgag agaggaggaa gaagaggagg ggtgacaatg 60
atatgtgggg catgtgggca cccaattttt taattcattc ttttgttgaa actgacatgt 120
gggtcccatg agatttatta tttttcggat cgaatcgcca cgtaagcgct acgtcaatgc 180
tacgtcagat gaagaccgag tcaaattagc cacgtaagcg ccacgtcagc caaaaccacc 240
atccaaaccg ccgagggacc tcatctgcac tggttttgat agttgaggga cccgttgtat 300
ctggtttttc gattgaagga cgaaaatcaa atttgttgac aagttaaggg accttaaatg 360
aacttattcc atttcaaaat attctgtgag ccatatatac cgtgggcttc caatcctcct 420
caaattaaag ggccttttta aaatagataa ttgccttctt tcagtcaccc ataaaagtac 480
aaaactacta ccaacaagca acatgcgcag ttacacacat tttctgcaca tttccgccac 540
gtcacaaaga gctaagagtt atccctagga caatctcatt agtgtagata catccattaa 600
tcttttatca gaggcaaacg taaagccgct ctttatgaca aaaataggtg acacaaaagt 660
gttatctgcc acatacataa cttcagaaat tacccaacac caagagaaaa ataaaaaaaa 720
atctttttgc aagctccaaa tcttggaaac ctttttcact ctttgcagca ttgtactctt 780
gctctttttc caaccgatcc atgtcaccct caagcttcta cttgatctac acgaagctca 840
ccgtgcacac aaccatggcc acaaaaaccc tataaaaccc catccgatcg ccatcatctc 900
atcatcagtt cattaccaac aaacaaaaga ggaaaaaaaa catatacact tctagtgatt 960
gtctgattga tcatcaatct agaggcggcc gcatggctag caaggtcgtc ttcttcgcgg 1020
cggcgctcat ggcggccatg gtggccatct ccggc 1055



<210> 8 

<211> 976 

<212> DNA 

<213> Triticum sp. 



<400> 8

ctgcaggcca gggaaagaca atggacatgc aaagaggtag gggcagggaa gaaacacttg 60
gagatcatag aagaacataa gaggttaaac ataggagggc ataatggaca attaaatcta 120
cattaattga actcatttgg gaagtaaaca aaatccatat tctggtgtaa atcaaactat 180
ttgacgcgga tttactaaga tcctatgtta attttagaca tgactggcca aaggtttcag 240
ttagttcatt tgtcacggaa aggtgttttc ataagtccaa aactctacca acttttttgc 300
acgtcatagc atagatagat gttgtgagtc attggataga tattgtgagt cagcatggat 360
ttgtgttgcc tggaaatcca actaaatgac aagcaacaaa acctgaaatg ggctttagga 420
gagatggttt atcaatttac atgttccatg caggctacct tccactactc gacatggtta 480
gaagttttga gtgccgcata tttgcggaag caatggcact actcgacatg gttagaagtt 540
ttgagtgccg catatttgcg gaagcaatgg ctaacagata catattctgc caaaccccaa 600
gaaggataat cactcctctt agataaaaag aacagaccaa tgtacaaaca tccacacttc 660
tgcaaacaat acaccagaac taggattaag cccattacgt ggctttagca gaccgtccaa 720
aaatctgttt tgcaagcacc aattgctcct tacttatcca gcttcttttg tgttggcaaa 780
ctgccctttt ccaaccgatt ttgtttcttc tcacgctttc ttcataggct aaactaacct 840
cggcgtgcac acaaccatgt cctgaacctt cacctcgtcc ctataaaagc ccatccaacc 900
ttacaatctc atcatcaccc acaacaccga gcaccccaat ctacagatca attcactgac 960
agttcactga tctaga 976



<210> 9 
<211> 1009 
<212> DNA 
<213> Oryza sp.

<400> 9

ctgcagtaat ggatacctag tagcaagcta gcttaaacaa atctaaattc caatctgttc 60
gtaaacgttt tctcgatcgc aattttgatc aaaactattg aaaacctcaa ttaaaccatt 120
caaaattttt aatataccca acaagagcgt ccaaaccaaa tatgtaaata tggatgtcat 180
gataattgac ttatgacaat gtgattattt catcaagtct ttaaatcatt aattctagtt 240
gaaggtttat gttttcttat gctaaagggt tatgtttata taagaatatt aaagagcaaa 300
ttgcaataga tcaacacaac aaatttgaat gtttccagat gtgtaaaaat atccaaatta 360
attgttttaa aatagtttta agaaggatct gatatgcaag tttgatagtt agtaaactgc 420
aaaagggctt attacatgga aaattcctta ttgaatatgt ttcattgact ggtttatttt 480
acatgacaac aaagttacta gtatgtcaat aaaaaaatac aaggttactt gtcaattgta 540
ttgtgccaag taaagatgac aacaaacata caaatttatt tgttctttta tagaaacacc 600
taacttatca aggatagttg gccacgcaaa aatgacaaca tactttacaa ttgtatcatc 660
ataaagatct tatcaagtat aagaacttta tggtgacata aaaaataatc acaagggcaa 720
gacacatact aaaagtatgg acagaaattt cttaacaaac tccatttgtt ttgtatccaa 780
aagcataaga aatgagtcat ggctgagtca tgatatgtag ttcaatcttg caaaattgcc 840
tttttgttaa gtattgtttt aacactacaa gtcacatatt gtctatactt gcaacaaaca 900
ctattaccgt gtatcccaag tggccttttc attgctatat aaactagctt gatcggtctt 960
tcaactcaca tcaattagct taagtttcca ttagcaactg ctaatagct 1009



<210> 10 
<211> 839 
<212> DNA 
<213> Oryza sp.

<400> 10

ctgcagtgta agtgtagctt cttatagctt agtgctttac tatcttcaca agcacatgct 60
atagtattgt tccaagatga aagaataatt catccttgct accaacttgc atgatattat 120
atttgtgaat atcctatctc ttggcttata atgaaatgtg ctgctgggtt attctgacca 180
tggtatttga gagcctttgt atagctgaaa ccaacgtata tcgagcatgg aacagagaac 240
aaaatgcaag gattttttta ttctggttca tgccctggat gggttaatat cgtgatcatc 300
aaaaaagata tgcataaaat taaagtaata aatttgctca taagaaacca aaaccaaaag 360
cacatatgtc ctaaacaaac tgcattttgt ttgtcatgta gcaatacaag agataatata 420
tgacgtggtt atgacttatt cactttttgt gactccaaaa tgtagtaggt ctaactgatt 480
gtttaaagtg atgtcttact gtagaagttt catcccaaaa gcaatcacta aagcaacaca 540
cacgtatagt ccaccttcac gtaattcttt gtggaagata acaagaaggc tcactgaaaa 600
ataaaagcaa agaaaaggat atcaaacaga ccattgtgca tcccattgat ccttgtatgt 660
ctatttatct atcctccttt tgtgtacctt acttctatct agtgagtcac ttcatatgtg 720
gacattaaca aactctatct taacatctag tcgatcacta ctttacttca ctataaaagg 780
accaacatat atcatccatt tctcacaaaa gcattgagtt cagtcccaca aaatctaga 839



<210> 11 
<211> 1302 
<212> DNA 
<213> Oryza sp. 



<400> 11 

ctgcagagat atggattttc taagattaat tgattctctg tctaaagaaa aaaagtatta 60
ttgaattaaa tggaaaaaga aaaaggaaaa aggggatggc ttctgctttt tgggctgaag 120
gcggcgtgtg gccagcgtgc tgcgtgcgga cagcgagcga acacacgacg gagcagctac 180
gacgaacggg ggaccgagtg gaccggacga ggatgtggcc taggacgagt gcacaaggct 240
agtggactcg gtccccgcgc ggtatcccga gtggtccact gtctgcaaac acgattcaca 300
tagagcgggc agacgcggga gccgtcctag gtgcaccgga agcaaatccg tcgcctgggt 360
ggatttgagt gacacggccc acgtgtagcc tcacagctct ccgtggtcag atgtgtaaaa 420
ttatcataat atgtgttttt caaatagtta aataatatat ataggcaagt tatatgggtc 480
aataagcagt aaaaaggctt atgacatggt aaaattactt acaccaatat gccttactgt 540
ctgatatatt ttacatgaca acaaagttac aagtacgtca tttaaaaata caagttactt 600
atcaattgta gtgtatcaag taaatgacaa caaacctaca aatttgctat tttgaaggaa 660
cacttaaaaa aatcaatagg caagttatat agtcaataaa ctgcaagaag gcttatgaca 720
tggaaaaatt acatacacca atatgcttta ttgtccggta tattttacaa gacaacaaag 780
ttataagtat gtcatttaaa aatacaagtt acttatcaat tgtcaagtaa atgaaaacaa 840
acctacaaat ttgttatttt gaaggaacac ctaaattatc aaatatagct tgctacgcaa 900
aatgacaaca tgcttacaag ttattatcat cttaaagtta gactcatctt ctcaagcata 960
agagctttat ggtgcaaaaa caaatataat gacaaggcaa agatacatac atattaagag 1020
tatggacaga catttcttta acaaactcca tttgtattac tccaaaagca ccagaagttt 1080
gtcatggctg agtcatgaaa tgtatagttc aatcttgcaa agttgccttt ccttttgtac 1140
tgtgttttaa cactacaagc catatattgt ctgtacgtgc aacaaactat atcaccatgt 1200
atcccaagat gcttttttat tgctatataa actagcttgg tctgtctttg aactcacatc 1260
aattagctta agtttccata agcaagtaca aatagctcta ga 1302
<210> 12
<211> 675
<212> DNA
<213> Oryza sp.



<400> 12 

ctgcagcatc ggcttaggtg tagcaacacg actttattat tattattatt attattatta 60
ttattttaca aaaatataaa atagatcagt ccctcaccac aagtagagca agttggtgag 120
ttattgtaaa gttctacaaa gctaatttaa aagttattgc attaacttat ttcatattac 180
aaacaagagt gtcaatggaa caatgaaaac catatgacat actataattt tgtttttatt 240
attgaaatta tataattcaa agagaataaa tccacatagc cgtaaagttc tacatgtggt 300
gcattaccaa aatatatata gcttacaaaa catgacaagc ttagtttgaa aaattgcaat 360
ccttatcaca ttgacacata aagtgagtga tgagtcataa tattattttt cttgctaccc 420
atcatgtata tatgatagcc acaaagttac tttgatgatg atatcaaaga acatttttag 480
gtgcacctaa cagaatatcc aaataatatg actcacttag atcataatag agcatcaagt 540
aaaactaaca ctctaaagca accgatggga aagcatctat aaatagacaa gcacaatgaa 600
aatcctcatc atccttcacc acaattcaaa tattatagtt gaagcatagt agtagaatcc 660
aacaacaatc tagag 675



<210> 13 
<211> 1098 
<212> DNA 
<213> Oryza sp. 



<400> 13 

ccaggcttca tcctaaccat tacaggcaag atgttgtatg aagaagggcg aacatgcaga 60
ttgttaaact gacacgtgat ggacaagaat gaccgattgg tgaccggtct gacaatggtc 120
atgtcgtcag cagacagcca tctcccacgt cgcgcctgct tccggtgaaa gtggaggtag 180
gtatgggccg tcccgtcaga aggtgattcg gatggcagcg atacaaatct ccgtccatta 240
atgaagagaa gtcaagttga aagaaaggga gggagagatg gtgcatgtgg gatccccttg 300
ggatataaaa ggaggacctt gcccacttag aaaggagagg agaaagcaat cccagaagaa 360
tcgggggctg actggcactt tgtagcttct tcatacgcga atccaccaaa acacaggagt 420
agggtattac gcttctcagc ggcccgaacc tgtatacatc gcccgtgtct tgtgtgtttc 480
cgctcttgcg aaccttccac agattgggag cttagaacct cacccagggc ccccggccga 540
actggcaaag gggggcctgc gcggtctccc ggtgaggagc cccacgctcc gtcagttcta 600
aattacccga tgagaaaggg aggggggggg gggaaatctg ccttgtttat ttacgatcca 660
acggatttgg tcgacaccga tgaggtgtct taccagttac cacgagctag attatagtac 720
taattacttg aggattcggt tcctaatttt ttacccgatc gacttcgcca tggaaaattt 780
tttattcggg ggagaatatc caccctgttt cgctcctaat taagatagga attgttacga 840
ttagcaacct aattcagatc agaattgtta gttagcggcg ttggatccct cacctcatcc 900
catcccaatt cccaaaccca aactcctctt ccagtcgccg acccaaacac gcatccgccg 960
cctataaatc ccacccgcat cgagcctatc aagcccaaaa aaccacaaac caaacgaaga 1020
aggaaaaaaa aaggaggaaa agaaaagagg aggaaagcga agaggttgga gagagacgct 1080
cgtctccacg tcgccgcc 1098



<210> 14 

<211> 432 

<212> DNA 

<213> Hordeum sp.



<400> 14 

cttcgagtgc ccgccgattt gccagcaatg gctaacagac acatattctg ccaaaacccc 60
agaacaataa tcacttctcg tagatgaaga gaacagacca agatacaaac gtccacgctt 120
cagcaaacag taccccagaa ctaggattaa gccgattacg cggctttagc agaccgtcca 180
aaaaaactgt tttgcaaagc tccaattcct ccttgcttat ccaatttctt ttgtgttggc 240
aaactgcact tgtccaaccg attttgttct tcccgtgttt cttcttaggc taactaacac 300
agccgtgcac atagccatgg tccggaatct tcacctcgtc cctataaaag cccagccaat 360
ctccacaatc tcatcatcac cgagaacacc gagaaccaca aaactagaga tcaattcatt 420
gacagtccac cg 432

<210> 15 

<211> 60 

<212> DNA 

<213> Triticum sp.



<400> 15 

atggctaagc gcctggtcct ctttgcggca gtagtcgtcg ccctcgtggc tctcaccgcc 60



<210> 16 
<211> 72 
<212> DNA 
<213> Oryza sp. 

<400> 16 

atggcaacta ccattttctc tcgtttttct atatactttt gtgctatgct attatgccag 60
ggttctatgg cc 72 



<210> 17 
<211> 85 
<212> DNA 
<213> Oryza sp.

<400> 17 

atgtggacat taacaaactc tatcttaaca tctagtcgat cactacttta cttcactata 60 
aaaggaccaa catatatcat ccatt 85 



<210> 18 
<211> 72 
<212> DNA 
<213> Oryza sp. 

<400> 18 

atggcgagtt ccgttttctc tcggttttct atatactttt gtgttcttct attatgccat 60
ggttctatgg cc 72 



<210> 19 
<211> 69 
<212> DNA 
<213> Oryza sp. 

<400> 19 

atgaagatca ttttcgtatt tgctctcctt gctattgttg catgcaacgc ttctgcacgg 60
tttgatgct 69



<210> 20 
<211> 63 
<212> DNA 
<213> Oryza sp.

<400> 20 
atggccgccc gcgccgccgc cgccgcgttc ctgctgctgc tcatcgtcgt tggtcaccgc 60
<? cc 63 



<210> 21 

<211> 63 

<212> DNA 

<213> Hordeum sp. 

<400> 21 

atggctaagc ggctggtcct ctttgtggcg gtaatcgtcg ccctcgtggc tctcaccacc 60 
gcc 63
